Lecture Notes in 
Computer Science 



1738 



C. Pandu Rangan V. Raman 
R. Ramanujam (Eds.) 



Foundations of 
Software Technology 
and Theoretical 
Computer Science 

19th Conference 

Chennai, India, December 1999 

Proceedings 





Lecture Notes in Computer Science 1738 

Edited by G. Goos, J. Hartmanis and J. van Leeuwen 




Springer 

Berlin 

Heidelberg 

New York 

Barcelona 

Hong Kong 

London 

Milan 

Paris 

Singapore 

Tokyo 




C. Pandu Rangan V. Raman 
R. Ramanujam (Eds.) 



Foundations of 
Software Technology 
and Theoretical 
Computer Science 



19th Conference 

Chennai, India, December 13-15, 1999 
Proceedings 




Springer 




Series Editors 



Gerhard Goos, Karlsruhe University, Germany 
Juris Hartmanis, Cornell University, NY, USA 
Jan van Leeuwen, Utrecht University, The Netherlands 

Volume Editors 

C. Pandu Rangan 

Indian Institute of Technology 

Department of Computer Science and Engineering 

Chennai 600036, India 

E-mail: rangan@iitm.ernet.in 

V. Raman 
R. Ramanujam 

Institute of Mathematical Sciences 
C.I.T. Campus, Chennai 600113, India 
E-mail: {vraman,jam}@imsc.ernet.in 

Cataloging-in-Publication data applied for 



Die Deutsche Bibliothek - CIP-Einheitsaufnahme 

Foundations of software technology and theoretical computer 
science : 19th conference, Chennai, India, December 13 - 15, 1999 ; 
proceedings / [FST and TCS 19]. C. Pandu Rangan . . . (ed.). - Berlin ; 
Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; 
Paris ; Singapore ; Tokyo : Springer, 1999 

(Lecture notes in computer science ; Vol. 1738) 

ISBN 3-540-66836-5 



CR Subject Classification (1998): F.1, F.3-4, G.2, F.2, D.2, D.1, D.3 
ISSN 0302-9743 

ISBN 3-540-66836-5 Springer-Verlag Berlin Heidelberg New York 



This work is subject to copyright. All rights are reserved, whether the whole or part of the material is 
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, 
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication 
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, 
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are 
liable for prosecution under the German Copyright Law. 

© Springer-Verlag Berlin Heidelberg 1999 
Printed in Germany 

Typesetting: Camera-ready by author 

SPIN: 1074993 1 06/3142 - 5 4 3 2 1 0 Printed on acid-free paper 




Preface 



This volume contains the proceedings of the 19th FST&TCS conference
(Foundations of Software Technology and Theoretical Computer Science),
organized under the auspices of the Indian Association for Research in Computing
Science (http://www.imsc.ernet.in/~iarcs).

This year's conference attracted [...] submissions from as many as [...] different
countries. Each submission was reviewed by at least three independent referees.
After a week-long e-mail discussion, the program committee met on the 7th and
[...] of August in Chennai and selected 30 papers for inclusion in the conference
program. We thank the program committee members and the reviewers for their
sincere efforts.

We are fortunate to have five invited speakers this year, providing for a very
attractive program: Martin Abadi (Bell Labs - Lucent Technologies, Palo Alto,
USA), Lila Kari (Univ. Western Ontario, Canada), Jean-Jacques Levy (INRIA,
Paris, France), Micha Sharir (Univ. Tel Aviv, Israel), and Seinosuke Toda ([...],
Tokyo, Japan). Moreover, the conference is preceded by a two-day workshop
(December 11-12, 1999) on Data Structures and succeeded by a two-day workshop
(December 16-17, 1999) on Foundations of Mobile Computation. The conference
also features two joint sessions with the International Symposium on Automata,
Algorithms and Computation (December 16-18, 1999, Chennai):
Monika Henzinger (Compaq Systems Research, Palo Alto, USA) is presenting
a tutorial on [...] algorithmics [...], and Kurt Mehlhorn (Max-Planck-Institut,
Saarbrücken, Germany) is giving a talk on algorithm engineering [...]. We thank
all the invited speakers for agreeing to talk at the conference, as well as for
providing abstracts and articles for the proceedings.

The Institute of Mathematical Sciences and the Indian Institute of Technology,
both at Chennai, are co-hosting the conference. We thank these Institutes,
as well as the others who extended financial support, agencies of the Government
of India and private software companies in India.

We thank the members of the organizing committee for making it happen.
Special thanks go to the staff of our institutes and to Alfred Hofmann of
Springer-Verlag for their help with the proceedings.

September 1999                    C. Pandu Rangan (IIT, Madras)
                                  V. Raman (IMSc, Chennai)
                                  R. Ramanujam

Conference General Chair 

C. Pandu Rangan (IIT, Madras) 



Program Committee 

n-K (IIT, Delhi) 
n (IMSc, Chennai) 
o on (UNU/IIST, Macau) 

(IIT, Kharagpur) 

(IIT, Delhi) 

n n (CUNY, New York) 
n o no (U. Chicago) 

R n (IISc, Bangalore) 

Ko on (U. Edinburgh) 

K (U. Maryland, College Park) 

K K n (IIT, Madras) 

K N n K (SMI, Chennai) 
n (U. Waterloo) 

K o (IMSc, Chennai) 

N n n (SUNY, Albany) 

R R n (King’s College, London) 

n R n (IMSc, Chennai) ( Co-chair) 

R R n (IMSc, Chennai) (Co-chair) 

R (IIT, Bombay) 

R (U. Illinois at Urbana- Champaign) 
N n n (SRI, Menlo Park) 

o o (SUNY, Stony Brook) 

n o on (IIT, Bombay) 
on o (RWTH, Aachen) 



Organizing Committee 

00 (IIT, Madras) 

K o (ATI Research, Chennai) 
R R (IIT, Madras) 

K R n n (MCC, Chennai) 








Table of Contents 

Invited Talk 1 
Recent Developments in the Theory of Arrangements of Surfaces 

Session 1(a) 
Dynamic Compressed Hyperoctrees with Application to the N-body Problem 
Largest Empty Rectangle among a Point Set 

Session 1(b) 
Renaming Is Necessary in Timed Regular Expressions 
Product Interval Automata: A Subclass of Timed Automata 

Session 2(a) 
The Complexity of Rebalancing a Binary Search Tree 
Fast Allocation and Deallocation with an Improved Buddy System 

Session 2(b) 
Optimal Bounds for Transformations of ω-Automata 
CTL+ Is Exponentially More Succinct than CTL 

Invited Talk 2 
A Top-Down Look at a Secure Message 

Session 3 
Explaining Updates by Minimal Sums 
A Foundation for Hybrid Knowledge Bases 

Session 4 
Hoare Logic for Mutual Recursion and Local Variables 

Invited Talk 3 
Explicit Substitutions and Programming Languages 

Session 5(a) 
Approximation Algorithms for Routing and Call Scheduling in All-Optical Chains and Rings 
A Randomized Algorithm for Flow Shop Scheduling 

Session 5(b) 
Synthesizing Distributed Transition Systems from Global Specifications 
Beyond Region Graphs: Symbolic Forward Analysis of Timed Automata 

Session 6 
Implicit Temporal Query Languages: Towards Completeness 
On the Undecidability of Some Sub-classical First-Order Logics 

Invited Talk 4 
How to Compute with DNA 

Session 7(a) 
A High Girth Graph Construction and a Lower Bound for Hitting Set Size for Combinatorial Rectangles 
Protecting Facets in Layered Manufacturing 

Session 7(b) 
The Receptive Distributed π-Calculus (Extended Abstract) 
Series and Parallel Operations on Pomsets 

Session 8 
Unreliable Failure Detectors with Limited Scope Accuracy and an Application to Consensus 

Invited Talk 5 
Graph Isomorphism: Its Complexity and Algorithms 

Session 9(a) 
Computing with Restricted Nondeterminism: The Dependence of the OBDD Size on the Number of Nondeterministic Variables 
Lower Bounds for Linear Transformed OBDDs and FBDDs (Extended Abstract) 

Session 9(b) 
A Unifying Framework for Model Checking Labeled Kripke Structures, Modal Transition Systems, and Interval Transition Systems 
Graded Modalities and Resource Bisimulation 

Session 10(a) 
The Non-recursive Power of Erroneous Computation 
Analysis of Quantum Functions (Preliminary Version) 

Session 10(b) 
On Sets Growing Continuously 
Model Checking Knowledge and Time in Systems with Perfect Recall (Extended Abstract) 

FST&TCS ISAAC Joint Session Talks 
The Engineering of Some Bipartite Matching Programs 

Author Index 




Recent Developments in the Theory of 
Arrangements of Surfaces* 



Micha Sharir 

^ School of Mathematical Sciences, Tel Aviv University 
Tel Aviv 69978, Israel 

^ Courant Institute of Mathematical Sciences, New York University 
New York, NY 10012, USA 



Abstract. We review recent progress in the study of arrangements of 
surfaces in higher dimensions. This progress involves new and nearly tight 
bounds on the complexity of lower envelopes, single cells, zones, and other 
substructures in such arrangements, and the design of efficient algorithms 
(near optimal in the worst case) for constructing and manipulating these 
structures. We then present applications of the new results to a variety 
of problems in computational geometry and its applications, including 
motion planning, Voronoi diagrams, union of geometric objects, visibility, 
and geometric optimization. 



1 Introduction 

The combinatorial, algebraic, and topological analysis of arrangements of sur- 
faces in higher dimensions has become one of the most active areas of research in 
computational geometry during the past decade. In this paper we will review the 
recent progress in the study of combinatorial and algorithmic problems related 
to such arrangements. 

Given a set $\Gamma$ of $n$ surfaces in $\mathbb{R}^d$, the arrangement $\mathcal{A}(\Gamma)$ that they induce is 
the decomposition of $\mathbb{R}^d$ into maximal connected regions (cells) of dimensions 
$0, 1, \ldots, d$, such that each region is contained in the intersection of a fixed subset 
of $\Gamma$ and is disjoint from all the other surfaces. For example, the arrangement of 
a set $L$ of lines in the plane consists of vertices (0-dimensional cells) which are 
the intersection points of the lines, edges (1-dimensional cells) which are relatively 
open intervals along the lines delimited by pairs of consecutive intersection 
points, and faces (2-dimensional cells) which are the connected components of 
$\mathbb{R}^2 \setminus \bigcup L$. 
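
To make the line-arrangement example concrete, here is a minimal Python sketch that counts the vertices, edges, and faces of an arrangement of lines. The function name `arrangement_counts` and the input convention (integer triples `(a, b, c)` for the line `a*x + b*y = c`) are assumptions made for this illustration only. Vertices are found by exact pairwise intersection, each line contributes one more edge than the number of vertices on it, and the face count follows from Euler's relation for planar line arrangements, F = E - V + 1.

```python
from fractions import Fraction
from itertools import combinations


def arrangement_counts(lines):
    """Count (vertices, edges, faces) of an arrangement of distinct lines.

    Each line is a hypothetical triple (a, b, c) of integers
    describing the line a*x + b*y = c.
    """
    on_line = [set() for _ in lines]   # vertices lying on each line
    vertices = set()
    for (i, (a1, b1, c1)), (j, (a2, b2, c2)) in combinations(enumerate(lines), 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                   # parallel lines do not meet
        # Cramer's rule with exact rational arithmetic.
        point = (Fraction(c1 * b2 - c2 * b1, det),
                 Fraction(a1 * c2 - a2 * c1, det))
        vertices.add(point)
        on_line[i].add(point)
        on_line[j].add(point)
    v = len(vertices)
    # k vertices cut a line into k + 1 relatively open edges.
    e = sum(len(points) + 1 for points in on_line)
    # Euler's relation for a line arrangement in the plane.
    f = e - v + 1
    return v, e, f
```

For three lines in general position, for example x = 0, y = 0, and x + y = 1, this yields 3 vertices, 9 edges, and 7 faces.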

Arrangements of lines and of hyperplanes have been studied earlier quite 
extensively. In fact, Edelsbrunner’s book from 1987 [36], one of the earliest text- 
books on computational geometry, is essentially devoted to the study of such 

* Work on this paper has been supported by NSF Grant CCR-97-32101, by a grant 
from the U.S.-Israeli Binational Science Foundation, by the ESPRIT IV LTR project 
No. 21957 (CGAL), and by the Hermann Minkowski-MINERVA Center for Geom- 
etry at Tel Aviv University. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 1-21, 1999. 
© Springer-Verlag Berlin Heidelberg 1999 






arrangements. However, it has recently been realized that the more general ar- 
rangements of (curved) surfaces are more significant and have a wider range of 
applications, because many geometric problems in diverse areas can be reduced 
to problems involving such arrangements (but not necessarily to arrangements 
of hyperplanes). We will give here four such examples: 

Motion planning. Assume that we have a robot system B with d degrees of 
freedom, i.e., we can represent each placement of B as a point in d-space. Suppose 
that the workspace of B is cluttered with obstacles, whose shapes and locations 
are known. For each combination of a geometric feature (vertex, edge, face) of 
an obstacle and a similar feature of B, define their contact surface as the set 
of all points in d-space that represent a placement of B in which contact is 
made between these specific features. Let Z be a point corresponding to a given 
initial free placement of B, in which it does not intersect any obstacle. Then 
the set of all free placements of B that can be reached from Z via a collision- 
free continuous motion will obviously correspond to the cell containing Z in the 
arrangement of the contact surfaces. Thus, the robot motion planning problem 
leads to the problem of computing a single cell in an arrangement of surfaces 
in higher dimensions. The combinatorial complexity of this cell, i.e., the total 
number of lower-dimensional faces appearing on its boundary, serves as a trivial 
lower bound for the running time of the motion planning problem (assuming the 
entire cell has to be output). It turns out that in most instances this bound can 
be almost matched by suitable algorithms. 
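
The reduction to a single cell can be illustrated with a deliberately crude discretization, which is not the exact algebraic construction discussed in this paper: assume the configuration space has been sampled on a grid whose entries record whether a placement is free. The cell containing the initial placement Z then becomes the connected component of the start entry, found by breadth-first search. The names `reachable_cell`, `free`, and `start` are hypothetical.

```python
from collections import deque


def reachable_cell(free, start):
    """Return the set of free grid entries connected to `start`.

    `free` is a 2D list of booleans (True = collision-free placement);
    `start` is a (row, col) pair standing in for the initial placement Z.
    This is a grid approximation of one cell of the arrangement.
    """
    rows, cols = len(free), len(free[0])
    if not free[start[0]][start[1]]:
        return set()                       # Z itself must be a free placement
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and free[nr][nc] and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen


# A wall of forbidden placements (middle column) splits the free space in two.
grid = [[True, False, True],
        [True, False, True],
        [True, False, True]]
cell = reachable_cell(grid, (0, 0))
```

Placements on the far side of the wall are free but lie in a different cell, so they are not reported.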

Generalized Voronoi diagrams. Let $S$ be a set of $n$ ‘simply-shaped’^1 pairwise-disjoint 
compact convex objects in $d$-space, and let $\rho$ be some metric. The 
Voronoi diagram $\mathrm{Vor}_\rho(S)$ of $S$ under the metric $\rho$ is defined, as usual, as the 
decomposition of $d$-space into Voronoi cells $V(s)$, for $s \in S$, where 

$$V(s) = \{x \in \mathbb{R}^d \mid \rho(x, s) \le \rho(x, s') \text{ for all } s' \in S\}.$$ 

For each $s \in S$ define a function $f_s(x) = \rho(x, s)$, for $x \in \mathbb{R}^d$ (where $\rho(x, s) = 
\min_{y \in s} \rho(x, y)$). Define $F(x) = \min_{s \in S} f_s(x)$, for $x \in \mathbb{R}^d$; we refer to $F$ as the 
lower envelope of the $f_s$'s. If we project the graph of $F$ onto $\mathbb{R}^d$, we obtain a 
decomposition of $\mathbb{R}^d$ into connected cells of dimensions $0, \ldots, d$, so that over each 
cell $F$ is attained by a fixed set of functions. This decomposition is nothing but 
$\mathrm{Vor}_\rho(S)$, as follows directly by definition, and as has already been noted in [39]. 
In other words, the study of Voronoi diagrams in $d$ dimensions is equivalent 
to the study of the lower envelope of a collection of surfaces in $\mathbb{R}^{d+1}$. Planar 
Voronoi diagrams have been studied intensively (see [20,36,60,68]), but very little 
is known about generalized Voronoi diagrams in higher dimensions. 
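
As a small illustration of this correspondence, the sketch below evaluates the lower envelope at a query point and reports which function attains it. It makes two simplifying assumptions that are not part of the general setting above: the objects are single points, and the metric is Euclidean. The name `envelope_value_and_site` is hypothetical.

```python
import math


def envelope_value_and_site(x, sites):
    """Evaluate the lower envelope F(x) = min_s f_s(x) at the point x.

    Assumes point sites and the Euclidean metric, so f_s(x) is the
    distance from x to site s. Returns (F(x), index of a site attaining it);
    the index identifies the Voronoi cell that contains x.
    """
    best = min(range(len(sites)), key=lambda i: math.dist(x, sites[i]))
    return math.dist(x, sites[best]), best


sites = [(0.0, 0.0), (4.0, 0.0)]
value, owner = envelope_value_and_site((1.0, 0.0), sites)
```

Here the query point (1, 0) lies in the Voronoi cell of the first site, and the envelope value is its distance to that site.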

^1 Formally, this means that each object in S is defined as a Boolean combination of a 
constant number of polynomial equalities and inequalities of constant maximum de- 
gree in a constant number of variables; we also refer to such objects as semialgebraic 
sets of constant description complexity. 






Transversals. Let $S$ be a set of $n$ ‘simply-shaped’ compact convex objects in $\mathbb{R}^d$. 
A hyperplane $h$ is called a transversal of $S$ if $h$ intersects every member of $S$. 
Let $T(S)$ denote the space of all hyperplane transversals of $S$. We wish to study 
the structure of $T(S)$. To facilitate this study, we apply the geometric duality 
described, e.g., in [36], which maps hyperplanes to points and vice versa, and 
which preserves incidences and the above/below relationship between points and 
hyperplanes. Specifically, this duality maps a point $(a_1, \ldots, a_d)$ to the hyperplane 
$x_d = a_1 x_1 + \cdots + a_{d-1} x_{d-1} + a_d$, and a hyperplane $x_d = b_1 x_1 + \cdots + b_{d-1} x_{d-1} + b_d$ 
to the point $(-b_1, \ldots, -b_{d-1}, b_d)$. Let $h$ be the hyperplane 
$x_d = a_1 x_1 + \cdots + a_{d-1} x_{d-1} + a_d$, and suppose that $h$ intersects an object $s \in S$. Translate $h$ up 
and down until it becomes tangent to $s$. Denote the resulting upper and lower 
tangent hyperplanes by 

$$x_d = a_1 x_1 + \cdots + a_{d-1} x_{d-1} + U_s(a_1, \ldots, a_{d-1})$$ 

and 

$$x_d = a_1 x_1 + \cdots + a_{d-1} x_{d-1} + L_s(a_1, \ldots, a_{d-1}),$$ 

respectively. Then we have 

$$L_s(a_1, \ldots, a_{d-1}) \le a_d \le U_s(a_1, \ldots, a_{d-1}).$$ 

Now if $h$ is a transversal of $S$, we must have 

$$\max_{s \in S} L_s(a_1, \ldots, a_{d-1}) \le a_d \le \min_{s \in S} U_s(a_1, \ldots, a_{d-1}).$$ 

In other words, in dual space, $T(S)$ is the region ‘sandwiched’ between a lower 
envelope of a collection of $n$ surfaces and an upper envelope of another collection 
of $n$ surfaces. 
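
The sandwich condition can be checked directly in a simple planar instance. The sketch below assumes, purely for illustration, that the objects are discs given as hypothetical triples `(cx, cy, r)` and that the candidate transversal is the non-vertical line y = a*x + b. For a disc, the tangent lines of slope a have intercepts cy - a*cx ± r*sqrt(1 + a^2), which play the roles of U_s(a) and L_s(a).

```python
import math


def tangent_intercepts(disc, a):
    """Return (U_s(a), L_s(a)) for a disc s = (cx, cy, r) and slope a.

    These are the intercepts of the upper and lower tangent lines
    of slope a to the disc.
    """
    cx, cy, r = disc
    mid = cy - a * cx                  # intercept of the line through the center
    half = r * math.sqrt(1.0 + a * a)  # vertical offset of the tangent lines
    return mid + half, mid - half


def is_transversal(a, b, discs):
    """Check max_s L_s(a) <= b <= min_s U_s(a) for the line y = a*x + b."""
    lower = max(tangent_intercepts(s, a)[1] for s in discs)
    upper = min(tangent_intercepts(s, a)[0] for s in discs)
    return lower <= b <= upper


row = [(0.0, 0.0, 1.0), (5.0, 0.0, 1.0), (10.0, 0.0, 1.0)]
```

For the three unit discs in `row`, the x-axis (a = 0, b = 0) stabs all of them, while the line y = 2 misses every disc.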



Geometric optimization. Many problems in geometric optimization can be solved 
by reducing them to problems involving the analysis of substructures in arrangements. 
One such example is the problem of computing the width of a set $S$ of $n$ 
points in $\mathbb{R}^3$; this is the smallest distance between two parallel planes enclosing 
$S$ between them. This problem has been studied in a series of recent papers 
[2,10,27], where more details can be found. Following standard techniques, 
it suffices to solve the decision problem where we are given a parameter $w > 0$ 
and wish to determine whether the width of $S$ is $\le w$. For this, we need to test 
all ‘antipodal’ pairs of edges of the convex hull $C$ of $S$ (these are pairs of edges 
that admit parallel supporting planes), and determine whether there exists such 
a pair whose supporting planes lie at distance $\le w$. In the worst case the number 
of such pairs may be $\Theta(n^2)$, and the goal is to avoid having to test explicitly 
all such pairs. This is achieved by a step that partitions the set of all antipodal 
pairs of edges into a collection of complete bipartite graphs, so that, for each 
such graph $E \times E'$, it suffices to determine whether there exist $e \in E$, $e' \in E'$, 
such that the distance between the lines containing $e$ and $e'$, respectively, is $\le w$. 
This problem can be solved by mapping the lines containing the edges in $E'$ into 
points in 4-space (lines in three dimensions have four degrees of freedom), and 
by mapping the lines containing the edges of $E$ into trivariate functions, so that 
the distance between a line $\ell'$ in the first set and a line $\ell$ in the second set is $\le w$ 
if and only if the point that represents $\ell'$ lies below the graph of the function 
that represents $\ell$. We have thus landed in a problem where we are given a collection 
of points and a collection of surfaces in 4-space, and we need to determine 
whether there exists a point that lies below the upper envelope of the surfaces. 
Full details can be found in the papers cited above. 
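
The primitive underlying this decision problem is the distance between two lines in three dimensions. The following sketch computes it and then runs the naive test over all pairs of a bipartite graph, which is exactly the quadratic work that the envelope-based method is designed to avoid. The representation of a line as a (point, direction) pair and the names `line_distance` and `exists_close_pair` are assumptions made for this example.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def line_distance(line1, line2):
    """Distance between two lines in 3-space, each given as (point, direction)."""
    (p, u), (q, v) = line1, line2
    w = tuple(qi - pi for pi, qi in zip(p, q))
    n = cross(u, v)                    # common normal of the two directions
    n_len = math.sqrt(dot(n, n))
    if n_len == 0.0:                   # parallel lines: distance from q to line1
        c = cross(w, u)
        return math.sqrt(dot(c, c)) / math.sqrt(dot(u, u))
    return abs(dot(w, n)) / n_len


def exists_close_pair(lines_e, lines_e_prime, w):
    """Naive decision procedure: is some pair of lines at distance <= w?"""
    return any(line_distance(l, lp) <= w
               for l in lines_e for lp in lines_e_prime)


x_axis = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
skew = ((0.0, 0.0, 2.0), (0.0, 1.0, 0.0))       # parallel to the y-axis, at height 2
parallel = ((0.0, 3.0, 4.0), (2.0, 0.0, 0.0))   # parallel to the x-axis
```

With these sample lines, the skew pair is at distance 2 and the parallel pair at distance 5.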

As these examples show, many problems in combinatorial and computational 
geometry can be rephrased in terms of certain substructures (single cells, lower 
envelopes, regions enclosed between an upper envelope and a lower envelope, 
etc.) in arrangements of surfaces in higher dimensions. The past decade has 
seen a significant expansion in the study of arrangements. This work has been 
described in the recent book [75] and in several other surveys [45,51,52,73,74]. 

The main theme in the study of arrangements is to analyze the combinato- 
rial complexity of such substructures, with the goal of obtaining bounds on this 
complexity that are significantly smaller than the complexity of the full arrange- 
ment. A second theme is the design of efficient algorithms for constructing such 
substructures. 

In this study there are three main relevant parameters: the number $n$ of surfaces, 
their maximum algebraic degree $b$, and the dimension $d$. The approach 
taken here is ‘combinatorial’, in which we want to calibrate the dependence of 
the complexity of the various structures and algorithms on the number $n$ of surfaces, 
assuming that the dimension and maximum degree, as well as any other 
factor that does not depend on $n$, is constant. In this way, all issues related to 
the algebraic complexity of the problem are ‘swept under the rug’. These issues 
should be (and indeed have been) picked up in the complementary ‘algebraic’ 
mode of research, where the dependence on the maximum degree $b$ is more relevant; 
this area is known as computational real algebraic geometry; see [56,69,64] 
for studies of this kind. 

2 Complexity of Lower Envelopes 

During the past decade, significant progress has been made on the problem 
of bounding the complexity of the lower envelope of a collection of multivariate 
functions. This problem has been open since 1986, when it was shown in [54] that 
the combinatorial complexity of the lower envelope of $n$ univariate continuous 
functions, each pair of which intersect in at most $s$ points, is at most $\lambda_s(n)$, 
the maximum length of an $(n,s)$-Davenport-Schinzel sequence. This bound is 
slightly super-linear in $n$, for any fixed $s$ (for example, it is $\Theta(n\alpha(n))$ for $s = 3$, 
where $\alpha(n)$ is the extremely slowly growing inverse of Ackermann’s function [54]; 
see also [12,75]). Since the complexity of the full arrangement of such a collection 
of functions can be $\Theta(n^2)$ in the worst case, this result shows that the worst-case 
complexity of the lower envelope is smaller than the overall complexity of the 
arrangement by nearly a factor of $n$. 
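
The link between envelopes and Davenport-Schinzel sequences can be seen in a small experiment. The sketch below, which uses hypothetical helper names, reads off the sequence of functions attaining the lower envelope along a list of sample abscissae (a sampling approximation, not an exact construction), and checks the defining property of an (n,s)-Davenport-Schinzel sequence: no two adjacent elements are equal, and no two symbols alternate in a subsequence of length s + 2.

```python
from itertools import combinations


def envelope_sequence(funcs, xs):
    """Indices of the functions attaining the lower envelope, in order of xs.

    Consecutive repetitions are collapsed. Sampling can miss short pieces,
    so this is only an approximation of the true envelope sequence.
    """
    seq = []
    for x in xs:
        i = min(range(len(funcs)), key=lambda k: funcs[k](x))
        if not seq or seq[-1] != i:
            seq.append(i)
    return seq


def is_davenport_schinzel(seq, s):
    """Check the Davenport-Schinzel property of order s."""
    if any(a == b for a, b in zip(seq, seq[1:])):
        return False
    for a, b in combinations(sorted(set(seq)), 2):
        # The number of runs in the subsequence restricted to {a, b} equals
        # the length of its longest alternating subsequence.
        runs, prev = 0, None
        for x in seq:
            if x in (a, b) and x != prev:
                runs += 1
                prev = x
        if runs > s + 1:
            return False
    return True


# A parabola and a constant intersect in two points, so s = 2.
funcs = [lambda x: x * x, lambda x: 1.0]
xs = [k / 10 for k in range(-30, 31)]
seq = envelope_sequence(funcs, xs)
```

For this pair of functions the constant is lowest outside [-1, 1] and the parabola inside, giving a sequence that is admissible for s = 2 but not for s = 1.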






It was then conjectured that a similar phenomenon occurs in higher dimensions. 
That is, the combinatorial complexity of the lower envelope of a collection 
$\mathcal{F}$ of $n$ ‘well-behaved’ $d$-variate functions should be close to $O(n^d)$ (as opposed 
to $\Theta(n^{d+1})$, which can be the complexity of the entire arrangement of 
the function graphs). More precisely, according to a stronger version of this 
conjecture, this quantity should be at most $O(n^{d-1}\lambda_s(n))$, for some constant $s$ 
depending on the shape of the functions in $\mathcal{F}$. 

This problem was almost completely settled in 1994 [49,72]: Let $\mathcal{F}$ be a collection 
of (possibly partially-defined) $d$-variate functions, such that all functions 
in $\mathcal{F}$ are algebraic of constant maximum degree and, in case of partial functions, 
the domain of definition of each function is a semi-algebraic set defined by a 
constant number of polynomial equalities and inequalities of constant maximum 
degree. (As already mentioned, we refer to such a region as having constant description 
complexity.) Then, for any $\varepsilon > 0$, the combinatorial complexity of the 
lower envelope of $\mathcal{F}$ is $O(n^{d+\varepsilon})$, where the constant of proportionality depends 
on $\varepsilon$, $d$, and on the maximum degree of the functions and of the polynomials 
defining their domains.^2 Thus, the weak form of the above conjecture has been 
settled in the affirmative. 

Prior to the derivation of this bound, the above conjectures had been confirmed 
only in some special cases, including the case in which the graphs of the 
functions are $d$-simplices in $\mathbb{R}^{d+1}$, where a tight worst-case bound, $\Theta(n^d\alpha(n))$, 
was established in [37,65]. (The case $d = 1$, involving $n$ segments in the plane, 
where the bound is $\Theta(n\alpha(n))$, had been analyzed earlier, in [54,80].) There are 
also some even more special cases, like the case of hyperplanes, where the maximum 
complexity of their lower envelope is known to be $\Theta(n^{\lfloor (d+1)/2 \rfloor})$, by the 
so-called Upper Bound Theorem for convex polytopes [62]. The case of balls also 
admits a much better bound, using a standard lifting transformation (see [36]). 

The new analysis technique (whose details can be found in [75]) has later 
been extended in various ways, to be described below. These additional results 
include near-optimal bounds on the complexity of a single cell in an arrangement 
of surfaces in $\mathbb{R}^d$, and on the complexity of the region enclosed between two 
envelopes of bivariate functions, as well as many other combinatorial results 
involving arrangements, as noted below, and efficient algorithms for constructing 
lower envelopes and single cells. The new results have been applied to obtain 
improved algorithmic and combinatorial bounds for a variety of problems; we 
will mention some of these applications later on. 



^2 In this paper we will state similar complexity bounds that depend on $\varepsilon > 0$, without 
saying repeatedly ‘for any $\varepsilon > 0$’; the meaning of such a bound is that it holds for 
any $\varepsilon > 0$, where the constant of proportionality depends on $\varepsilon$ (and perhaps also 
on other problem constants), and usually tends to infinity when $\varepsilon$ tends to zero. For 
algorithmic complexity, the meaning of such a bound is that the algorithm can be 
fine-tuned, as a function of $\varepsilon$, so that its complexity obeys the stated bound. 






3 Algorithms for Lower Envelopes 

Once the combinatorial complexity of lower envelopes of multivariate functions 
has been (more or less) resolved, the next task is to derive efficient algorithms 
for computing such lower envelopes. One of the strongest specifications of such 
a computation is as follows. We are given a collection $\mathcal{F}$ of $d$-variate algebraic 
functions satisfying the above conditions. We want to compute the lower envelope 
$E_{\mathcal{F}}$ and store it in some data structure, so that, given a query point 
$p \in \mathbb{R}^d$, we can efficiently compute the value $E_{\mathcal{F}}(p)$, and the function(s) attaining 
$E_{\mathcal{F}}$ at $p$. (This is, for example, the version that is needed for solving 
the three-dimensional width problem described above.) Of course, we need to 
assume here an appropriate model of computation, where various primitive operations 
on a constant number of functions can be each performed in constant 
time. As mentioned above, the exact arithmetic model in computational real 
algebraic geometry is an appropriate choice. 

This task has recently been accomplished for the case of bivariate functions 
in several papers [7,22,23,33,72]. Some of these techniques use randomized algorithms, 
and their expected running time is $O(n^{2+\varepsilon})$, which is comparable with 
the maximum complexity of such an envelope. The simplest algorithm is probably 
the one given in [7]. It is deterministic and uses divide-and-conquer. That 
is, it partitions the set of functions into two subsets of roughly equal size, and 
computes recursively the lower envelopes $E_1$, $E_2$ of these two subsets. It then 
projects these envelopes onto the $xy$-plane, to obtain two respective planar maps 
$M_1$, $M_2$, and forms the overlay $M$ of $M_1$ and $M_2$. Over each face $f$ of $M$ there 
are only two functions that can attain the final envelope (the function attaining 
$E_1$ over $f$ and the function attaining $E_2$), so we compute the lower envelope 
of these two functions over $f$, and repeat this step for all faces of $M$. It is easy 
to see that the cost of this step is proportional to the number of faces of $M$. In 
general, overlaying two planar maps of complexity $N$ each can result in a map 
whose complexity is $\Theta(N^2)$, which may be as large as about $n^4$ in our case. Fortunately, it 
was shown in [7] that for lower envelopes, the overlay of $M_1$ and $M_2$ has complexity 
$O(n^{2+\varepsilon})$. In other words, the worst-case bound on the complexity of the 
overlay of (the $xy$-projections of) two lower envelopes of bivariate functions is 
asymptotically the same as the bound for the complexity of a single envelope. 
This implies that the complexity of the above divide-and-conquer algorithm is 
$O(n^{2+\varepsilon})$. 
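
The divide-and-conquer scheme can be sketched in a much simpler setting than the bivariate algebraic case treated in [7]: lower envelopes of lines in the plane, one dimension lower. This is an analogue chosen for brevity, not the algorithm of [7]. The envelope is stored as a list of pieces `(left, right, line)`, the overlay of two envelopes is the common refinement of their breakpoints, and over each overlay interval only two lines compete. All names and the representation of a line as `(slope, intercept)` are assumptions of this example.

```python
import math


def lower_of_two(f, g, lo, hi):
    """Lower envelope of two lines f, g = (slope, intercept) over (lo, hi)."""
    (mf, cf), (mg, cg) = f, g
    if mf != mg:
        x = (cg - cf) / (mf - mg)            # abscissa where f and g cross
        if lo < x < hi:
            # Left of the crossing, the line with the larger slope is lower.
            left, right = (f, g) if mf > mg else (g, f)
            return [(lo, x, left), (x, hi, right)]
    # No crossing inside the interval: compare at one interior point.
    if math.isinf(lo) and math.isinf(hi):
        t = 0.0
    elif math.isinf(lo):
        t = hi - 1.0
    elif math.isinf(hi):
        t = lo + 1.0
    else:
        t = (lo + hi) / 2
    return [(lo, hi, f if mf * t + cf <= mg * t + cg else g)]


def merge(env1, env2):
    """Overlay two envelopes and keep the lower line over each overlay interval."""
    out, i, j = [], 0, 0
    while i < len(env1) and j < len(env2):
        l1, r1, f = env1[i]
        l2, r2, g = env2[j]
        lo, hi = max(l1, l2), min(r1, r2)
        if lo < hi:
            for piece in lower_of_two(f, g, lo, hi):
                if out and out[-1][2] == piece[2]:
                    out[-1] = (out[-1][0], piece[1], piece[2])  # glue equal neighbors
                else:
                    out.append(piece)
        if r1 <= r2:
            i += 1
        if r2 <= r1:
            j += 1
    return out


def lower_envelope(lines):
    """Divide-and-conquer lower envelope of a non-empty list of lines."""
    if len(lines) == 1:
        return [(-math.inf, math.inf, lines[0])]
    mid = len(lines) // 2
    return merge(lower_envelope(lines[:mid]), lower_envelope(lines[mid:]))
```

For the lines y = x, y = -x, and y = 5, the envelope consists of y = x to the left of the origin and y = -x to its right; the line y = 5 appears in an intermediate envelope but not in the final one.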

In higher dimensions, the only result known so far is that lower envelopes 
of trivariate functions satisfying the above properties can be computed, in the 
above strong sense, in randomized expected time $O(n^{3+\varepsilon})$ [2]. For $d > 3$, it 
is also shown in [2] that all vertices, edges and 2-faces of the lower envelope 
of $n$ $d$-variate functions, as above, can be computed in randomized expected 
time $O(n^{d+\varepsilon})$. It is still an open problem whether such a lower envelope can 
be computed within similar time bounds in the above stronger sense. Another, 
more difficult problem is to devise output-sensitive algorithms, whose complexity 
depends on the actual combinatorial complexity of the envelope. This problem is 
really hard even for collections of nonintersecting triangles in 3-space. It would 
also be interesting to develop algorithms for certain special classes of functions, 
where better bounds are known for the complexity of the envelope, e.g., for 
envelopes of piecewise-linear functions (see below for more details). 



4 Single Cells 

Lower envelopes are closely related to other substructures in arrangements, notably 
single cells and zones. (The lower envelope is a portion of the boundary 
of the bottommost cell of the arrangement.) In two dimensions, it was shown 
in [46] that the complexity of a single face in an arrangement of $n$ arcs, each 
pair of which intersect in at most $s$ points, is $O(\lambda_{s+2}(n))$, and so is of the same 
asymptotic order of magnitude as the complexity of the lower envelope of such 
a collection of arcs. Again, the prevailing conjecture is that the same holds in 
higher dimensions. That is, the complexity of a single cell in an arrangement 
of $n$ algebraic surfaces in $d$-space satisfying the above assumptions is close to 
$O(n^{d-1})$, or, in a stronger form, this complexity should be $O(n^{d-2}\lambda_s(n))$, for 
some appropriate constant $s$. The weaker version of this conjecture has recently 
been confirmed in [21]; see also [50] for an earlier result for the three-dimensional 
case: Let $\mathcal{A}$ be an arrangement of $n$ surface patches in $\mathbb{R}^d$, all of them algebraic 
of constant description complexity. It was proved in [21,50] that, for any $\varepsilon > 0$, 
the complexity of a single cell in $\mathcal{A}$ is $O(n^{d-1+\varepsilon})$, where the constant of proportionality 
depends on $\varepsilon$, $d$, and on the maximum degree of the surfaces and of 
their boundaries. 

The results of [50] mentioned above easily imply that, for fairly general robot 
systems with $d$ degrees of freedom, the complexity of the space of all free placements 
of the system, reachable from a given initial placement, is $O(n^{d-1+\varepsilon})$, 
a significant improvement over the previous, naive bound $O(n^d)$. In three dimensions, 
the corresponding algorithmic problem, of devising an efficient (near-quadratic) 
algorithm for computing such a cell, has recently been solved in [70]. 
We will say more about this result when we discuss vertical decompositions below. 
Prior to this result, several other near-quadratic algorithms were proposed 
for some special classes of surfaces [7,17,47,48]. For example, the paper [48] 
gives a near-quadratic algorithm for the single cell problem in the special case 
of arrangements that arise in the motion planning problem for a (nonconvex) 
polygonal robot moving (translating and rotating) in a planar polygonal region. 
However, this algorithm exploits the special structure of the surfaces that arise 
in this case (namely, that any cross-section of such an arrangement at a fixed 
orientation of the polygon is polygonal), and does not extend to the general case. 
The algorithm given in [7] also provides a near-quadratic solution for the case 
where all the surfaces are graphs of totally-defined continuous algebraic bivariate 
functions (so that the cell in question is $xy$-monotone). 

In higher dimensions, we also mention the special case of a single cell in an 
arrangement of $n$ $(d-1)$-simplices in $\mathbb{R}^d$. It was shown in [17] that the complexity 
of such a cell is $O(n^{d-1}\log n)$; a simplified proof was recently given in [77]. This 
bound is sharper than the general bound stated above; the best lower bound 
known for this complexity is $\Omega(n^{d-1}\alpha(n))$, so a small gap between the upper 
and lower bounds still remains. 

5 Zones 

Given an arrangement A of surfaces in $\mathbb{R}^d$ and another surface $\sigma_0$, the zone
of $\sigma_0$ is the collection of all cells of the arrangement A that $\sigma_0$ crosses, and
the complexity of the zone is the sum of complexities of all these cells. The
‘classical’ Zone Theorem [36,40] asserts that the maximum complexity of the
zone of a hyperplane in an arrangement of n hyperplanes in $\mathbb{R}^d$ is $\Theta(n^{d-1})$,
where the constant of proportionality depends on d. This has been extended
in [14] to the zone of an algebraic or convex surface (of any dimension p < d)
in an arrangement of hyperplanes. The bound on the complexity of such a zone
is $O(n^{\lfloor (d+p)/2 \rfloor} \log^c n)$, and $\Omega(n^{\lfloor (d+p)/2 \rfloor})$ in the worst case, where c = 1 when
d - p is odd and c = 0 when d - p is even. It is not clear whether the logarithmic
factor is really needed, or whether it is just an artifact of the proof technique.
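
As a small worked illustration, the following sketch tabulates the exponent and the logarithmic power in the bound of [14], read here as $O(n^{\lfloor (d+p)/2 \rfloor} \log^c n)$ with c = 1 exactly when d - p is odd. The function name `zone_bound` is ours and purely illustrative.

```python
def zone_bound(d, p):
    """Return (exponent, log_power) for a bound of the form
    O(n^exponent * log^log_power n) on the zone of a p-dimensional
    surface in an arrangement of hyperplanes in d-space."""
    if not 0 <= p < d:
        raise ValueError("need 0 <= p < d")
    exponent = (d + p) // 2      # floor((d + p) / 2)
    log_power = (d - p) % 2      # c = 1 if d - p is odd, c = 0 otherwise
    return exponent, log_power


# A 2-dimensional surface in 3-space: O(n^2 log n).
surface_in_3_space = zone_bound(3, 2)
# A 2-dimensional surface in 4-space: O(n^3), with no logarithmic factor.
surface_in_4_space = zone_bound(4, 2)
```

For a hypersurface (p = d - 1) the exponent is d - 1 and the logarithmic factor is present, which is within a logarithmic factor of the classical Zone Theorem.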

The result of [21,50] can easily be extended to obtain a bound of $O(n^{d-1+\varepsilon})$
on the complexity of the zone of an algebraic surface $\sigma_0$ (of constant
description complexity) in an arrangement of n algebraic surfaces in $\mathbb{R}^d$ as above.
Intuitively, the proof proceeds by cutting each of the given surfaces along its
intersection with $\sigma_0$, and by shrinking the surface away from that intersection,
thus leaving a ‘tiny’ gap there. These modifications transform the zone of $\sigma_0$ into
a single cell in the arrangement of the new surfaces, and the result of [21] can
then be applied. (The same technique has been used earlier in [38], to obtain
a near-linear bound on the complexity of the zone of an arc in a 2-dimensional
arrangement of arcs.) A similar technique implies that the complexity of the
zone of an algebraic or convex surface in an arrangement of n (d-1)-simplices
in $\mathbb{R}^d$ is $O(n^{d-1} \log n)$ [17,77].

6 Generalized Voronoi Diagrams 

One of the interesting applications of the new bounds on the complexity of lower
envelopes is to generalized Voronoi diagrams in higher dimensions. Let S be a
set of n pairwise-disjoint convex objects in d-space, each of constant description
complexity, and let p be some metric. The Voronoi diagram $\mathrm{Vor}_p(S)$ of S under
the metric p has been defined in the introduction. As shown there, the diagram
is simply the minimization diagram of the lower envelope of n d-variate ‘distance
functions’ induced by the objects of S.
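
The view of a Voronoi diagram as a minimization diagram can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: the sites are points in the plane, the metric is Euclidean, and the diagram is sampled on a grid rather than computed exactly. The names `nearest_site` and `minimization_diagram` are ours.

```python
import math


def nearest_site(x, sites, dist=math.dist):
    """Index of the site whose distance function attains the lower envelope at x."""
    values = [dist(x, s) for s in sites]
    return min(range(len(sites)), key=values.__getitem__)


def minimization_diagram(sites, xs, ys, dist=math.dist):
    """Label every grid point (x, y) with the index of its nearest site."""
    return [[nearest_site((x, y), sites, dist) for x in xs] for y in ys]


sites = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
labels = minimization_diagram(sites, xs=[0, 1, 2, 3, 4], ys=[0, 1, 2, 3])
```

Passing a different `dist` callable swaps in another distance function without changing the envelope computation, which is the point of the reduction.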

In the classical case, in which p is the Euclidean metric and the objects in S
are singletons (points), one can replace these distance functions by a collection
of n hyperplanes in $\mathbb{R}^{d+1}$, so the maximum possible complexity of their lower
envelope (and thus of $\mathrm{Vor}_p(S)$) is $\Theta(n^{\lceil d/2 \rceil})$ (see, e.g., [36]). In more general
settings, though, this reduction is not possible. Nevertheless, the new bounds
on the complexity of lower envelopes imply that the complexity of the diagram
is $O(n^{d+\varepsilon})$. While this bound is nontrivial, it is conjectured to be too weak.
For example, this bound is near-quadratic for planar Voronoi diagrams, but the
complexity of practically any kind of planar Voronoi diagram is known to be
only $O(n)$.
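
The replacement of Euclidean distance functions by hyperplanes rests on the identity $|x - s|^2 = |x|^2 - 2\langle x, s\rangle + |s|^2$: the term $|x|^2$ is common to all sites, so minimizing the squared distance is the same as minimizing the function $-2\langle x, s\rangle + |s|^2$, which is linear in x. The sketch below checks this on random integer data (integers keep the arithmetic exact); all names in it are illustrative.

```python
import random


def squared_distance(x, s):
    return sum((xi - si) ** 2 for xi, si in zip(x, s))


def lifted_hyperplane(x, s):
    """Height at x of the hyperplane h_s(x) = -2<x, s> + |s|^2 (linear in x)."""
    return -2 * sum(xi * si for xi, si in zip(x, s)) + sum(si * si for si in s)


def argmin(values):
    return min(range(len(values)), key=values.__getitem__)


random.seed(0)
d = 3
sites = [tuple(random.randint(-50, 50) for _ in range(d)) for _ in range(20)]
queries = [tuple(random.randint(-50, 50) for _ in range(d)) for _ in range(200)]

# The nearest site and the lowest hyperplane coincide at every query point.
agree = all(
    argmin([squared_distance(q, s) for s in sites])
    == argmin([lifted_hyperplane(q, s) for s in sites])
    for q in queries
)
```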

In three dimensions, the above-mentioned bound for point sites and Euclidean
metric is $O(n^2)$. It has been a long-standing open problem whether a similar
quadratic or near-quadratic bound holds in 3-space for more general objects
and metrics (here the new results on lower envelopes give only an upper bound
of $O(n^{3+\varepsilon})$). Thus the problem stated above calls for improving this bound by
roughly another factor of n. It thus appears to be a considerably more difficult
problem than that of lower envelopes, and the only hope of making progress here
is to exploit the special structure of the distance functions p(x, s).

Fortunately, some progress on this problem was made recently. It was shown
in [29] that the complexity of the Voronoi diagram is $O(n^2\alpha(n) \log n)$, for the case
where the objects of S are lines, and the metric p is a convex distance function
induced by a convex polytope with a constant number of facets. (Note that the
$L_1$ and $L_\infty$ metrics are special cases of such distance functions. Note also that
such a distance function is not necessarily a metric, because it will fail to be
symmetric if the defining polytope is not centrally symmetric.) The best known
lower bound for the complexity of the diagram in this special case is $\Omega(n^2\alpha(n))$.
In another recent paper [24], it is shown that the maximum complexity of the
$L_1$-Voronoi diagram of a set of n points in $\mathbb{R}^3$ is $\Theta(n^2)$. Finally, it is shown
in [78] that the complexity of the three-dimensional Voronoi diagram of point
sites under general polyhedral convex distance functions is $O(n^2 \log n)$. The most
intriguing unsolved problem is to obtain a similar bound for a set S of n lines
in space but under the Euclidean metric.
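
The remark that a polyhedral convex distance function need not be symmetric can be made concrete. The sketch below assumes that the polytope is given as $P = \{z : \langle a_i, z\rangle \le 1\}$ with the origin in its interior, and that the distance from x to y is the smallest t with $y - x \in tP$, which equals $\max_i \langle a_i, y - x\rangle$. This representation and the names used are our choices for illustration, not notation taken from [29].

```python
def convex_distance(x, y, facet_normals):
    """Distance from x to y induced by P = {z : <a, z> <= 1 for all a in facet_normals}."""
    v = [yi - xi for xi, yi in zip(x, y)]
    return max(0.0, max(sum(ai * vi for ai, vi in zip(a, v)) for a in facet_normals))


L_INF = [(1, 0), (-1, 0), (0, 1), (0, -1)]      # P = square [-1, 1]^2
L_1 = [(1, 1), (1, -1), (-1, 1), (-1, -1)]      # P = cross-polytope |x| + |y| <= 1
TRIANGLE = [(1, 0), (0, 1), (-1, -1)]           # P is not centrally symmetric

forward = convex_distance((0, 0), (1, 1), TRIANGLE)
backward = convex_distance((1, 1), (0, 0), TRIANGLE)
```

With the triangle, `forward` and `backward` differ, whereas the square and the cross-polytope reproduce the (symmetric) $L_\infty$ and $L_1$ distances.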

An interesting special case of these problems involves dynamic Voronoi
diagrams for moving points in the plane. Let S be a set of n points in the plane,
each moving along some line at some fixed velocity. The goal is to bound the
number of combinatorial changes of $\mathrm{Vor}_p(S)$ over time. This dynamic Voronoi
diagram can easily be transformed into a 3-dimensional Voronoi diagram, by
adding the time t as a third coordinate. The points become lines in 3-space,
and the metric is a distance function induced by a horizontal disc (that is, the
distance from a point $p(x_0, y_0, t_0)$ to a line $\ell$ is the Euclidean distance from p
to the point of intersection of $\ell$ with the horizontal plane $t = t_0$). Here too the
open problem is to derive a near-quadratic bound on the complexity of the
diagram. Cubic or near-cubic bounds are known for this problem, even under more
general settings [42,44,72], but subcubic bounds are known only in some very
special cases [28].
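
The distance function used in this transformation is easy to write down. In the sketch below a moving site is represented, by our own convention, as a pair (start, velocity), so that its trajectory is a line in xyt-space; the distance from $(x_0, y_0, t_0)$ to that line is measured inside the plane $t = t_0$. Sampling a few times only hints at where the nearest site changes; it does not count combinatorial changes of the diagram.

```python
import math


def position(site, t):
    """Location at time t of a point moving as start + t * velocity."""
    (px, py), (vx, vy) = site
    return (px + t * vx, py + t * vy)


def distance_to_moving_site(x0, y0, t0, site):
    """Distance from (x0, y0, t0) to the site's line, measured in the plane t = t0."""
    return math.dist((x0, y0), position(site, t0))


def nearest_over_time(query, sites, times):
    """Index of the moving site nearest to a fixed planar point at each sampled time."""
    x0, y0 = query
    return [
        min(range(len(sites)),
            key=lambda i: distance_to_moving_site(x0, y0, t, sites[i]))
        for t in times
    ]


sites = [((0.0, 0.0), (1.0, 0.0)),     # starts at the origin, moves right
         ((10.0, 0.0), (-1.0, 0.0))]   # starts at x = 10, moves left
history = nearest_over_time((4.0, 0.0), sites, times=[0, 2, 4, 6, 8])
```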

Next, consider the problem of bounding the complexity of generalized Voronoi
diagrams in higher dimensions. As mentioned above, when the objects in S
are n points in $\mathbb{R}^d$ and the metric p is Euclidean, the complexity of $\mathrm{Vor}_p(S)$
is $O(n^{\lceil d/2 \rceil})$. As d increases, this becomes drastically smaller than the naive
$O(n^{d+1})$ bound or the improved bound, $O(n^{d+\varepsilon})$, obtained by viewing the
Voronoi diagram as a lower envelope in $\mathbb{R}^{d+1}$. The same bound of $\Theta(n^{\lceil d/2 \rceil})$
has recently been obtained in [24] for the complexity of the $L_\infty$-diagram of n
points in d-space (it was also shown that this bound is tight in the worst case).
It is thus tempting to conjecture that the maximum complexity of generalized
Voronoi diagrams in higher dimensions is close to this bound. Unfortunately, this
was recently shown to be false in [15], where a lower bound of $\Omega(n^{d-1})$ is given.
The sites used in this construction are convex polytopes, and the distance is
either Euclidean or a polyhedral convex distance function. For d = 3, this lower
bound does not contradict the conjecture made above, that the complexity of
generalized Voronoi diagrams should be at most near-quadratic in this case.
Also, in higher dimensions, the conjecture mentioned above is still not refuted
when the sites are singleton points. Finally, for the general case, the construction
of [15] still leaves a gap of roughly a factor of n between the known upper and
lower bounds.



7 Union of Geometric Objects 

A subproblem related to generalized Voronoi diagrams is as follows. Let S and p
be as above (say, for the 3-dimensional case). Let K denote the region consisting
of all points $x \in \mathbb{R}^3$ whose smallest distance from a site in S is at most r, for
some fixed parameter r > 0. Then $K = \bigcup_{s \in S} B(s, r)$, where
$B(s, r) = \{x \in \mathbb{R}^3 \mid p(x, s) \le r\}$. We thus face the problem of bounding the
combinatorial complexity of the union of n objects in 3-space (of some special type).
For example, if S is a set of lines and p is the Euclidean distance, the objects are
n congruent infinite cylinders in 3-space. In general, if the metric p is a distance
function induced by some convex body P, the resulting objects are the Minkowski
sums $s \oplus (-rP)$, for $s \in S$, where $A \oplus B = \{x + y \mid x \in A,\ y \in B\}$. Of course,
this problem can also be stated in any higher dimension.
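
For line sites under the Euclidean distance, membership in the union K of congruent cylinders reduces to a point-to-line distance test. The sketch below is a minimal illustration; representing a line as a pair (point on the line, direction vector) and the function names are our assumptions.

```python
import math


def distance_point_to_line(x, a, u):
    """Euclidean distance from x to the line {a + t*u}; u need not have unit length."""
    w = [xi - ai for xi, ai in zip(x, a)]
    t = sum(wi * ui for wi, ui in zip(w, u)) / sum(ui * ui for ui in u)
    return math.sqrt(sum((wi - t * ui) ** 2 for wi, ui in zip(w, u)))


def in_union(x, lines, r):
    """True if x lies in K, i.e. within distance r of at least one line site."""
    return any(distance_point_to_line(x, a, u) <= r for a, u in lines)


lines = [((0, 0, 0), (1, 0, 0)),    # the x-axis
         ((0, 0, 5), (0, 1, 0))]    # a line parallel to the y-axis at height 5
```

Such a test answers point queries only; it says nothing about the combinatorial complexity of the boundary of K, which is the quantity bounded in this section.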

Since it has been conjectured that the complexity of the whole Voronoi dia- 
gram should be near-quadratic (in 3-space), the same conjecture should apply to 
the (simpler) structure K (whose boundary can be regarded as a ‘cross-section’ 
of the diagram at ‘height’ r; it does indeed correspond to the cross-section at 
height r of the lower envelope that represents the diagram). Recently, this
conjecture has been confirmed in [18], in the special case where both P and the
objects of S are convex polyhedra (see also [53] for an earlier study of a special 
case, with a slightly better bound). Let us discuss this result in more detail. An 
earlier paper [16] has studied the case involving the union of k arbitrary convex
polyhedra in 3-space, with a total of n faces. It was shown there that the
complexity of the union is $O(k^3 + nk \log^2 k)$, and can be $\Omega(k^3 + nk\alpha(k))$ in the worst
case. The upper bound was subsequently improved to $O(k^3 + nk \log k)$ [19], which
still leaves a small gap between the upper and lower bounds. In the subsequent
paper [18], these bounds were improved in the special case where the polyhedra
in question are Minkowski sums of the form $s_i \oplus P$, where the $s_i$'s are k
pairwise-disjoint convex polyhedra, P is a convex polyhedron, and the total number of
faces of these Minkowski sums is n. The improved bounds are $O(nk \log k)$ and
$\Omega(nk\alpha(k))$. They are indeed near-quadratic, as conjectured.

Recently, the case where P is a ball (namely, the case of the Euclidean
distance) has been solved in [11]. It is shown there that the complexity of the
union of the Minkowski sums of n pairwise-disjoint triangles in $\mathbb{R}^3$ with a ball is
$O(n^{2+\varepsilon})$, for any $\varepsilon > 0$. In the special case where instead of triangles we have n
lines, we obtain that the complexity of the union of n infinite congruent cylinders
in 3-space is $O(n^{2+\varepsilon})$.

In higher dimensions, it was recently shown in [24] that the maximum
complexity of the union of n axis-parallel hypercubes in d-space is $\Theta(n^{\lceil d/2 \rceil})$, and
this improves to $\Theta(n^{\lfloor d/2 \rfloor})$ when all the hypercubes have the same size.

The above instances involve Minkowski sums of a collection of pairwise dis- 
joint convex objects with a fixed convex object. Of course, one may consider the 
union of arbitrary objects and look for special cases where improved combina- 
torial bounds can be established. For example, it is conjectured that the union 
of n arbitrarily-aligned cubes in 3-space has near-quadratic complexity. 



8 Vertical Decomposition 

In many algorithmic applications, one needs to decompose a d-dimensional ar- 
rangement, or certain portions thereof, into a small number of subcells, each 
having constant description complexity. In a typical setup where this problem 
arises, we need to process in a certain manner an arrangement of n surfaces in 
d-space. We choose a random sample of r of the surfaces, for some sufficiently 
large constant r, construct the arrangement of these r surfaces, and decompose 
it into subcells as above. Since no such subcell is crossed by any surface in the
random sample, it follows by standard $\varepsilon$-net theory [30,55,59] that with high
probability, none of these subcells is crossed by more than $O(\frac{n}{r} \log r)$ of the n
given surfaces. (For this result to hold, it is essential that each of these
subcells have constant description complexity.) This allows us to break the problem
into recursive subproblems, one for each of these subcells, solve each subproblem 
separately, and then combine their outputs to obtain a solution for the original 
problem. The efficiency of this method crucially depends on the number of sub- 
cells. The smaller this number is, the faster is the resulting algorithm. (We note 
that the construction of a ‘good’ sample of r surfaces can also be performed 
deterministically, e.g., using the techniques of Matousek [61].) 
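
The sampling step can be illustrated in the simplest possible setting, a one-dimensional ‘arrangement’ in which the surfaces are points on the real line and the subcells of the sample are the open intervals between consecutive sampled points. The sketch below is a toy simulation under these assumptions; it compares the most heavily loaded interval with the reference value $(n/r) \log r$ and proves nothing.

```python
import bisect
import math
import random


def max_cell_load(points, sample):
    """Largest number of input points inside one open interval cut out by the sample."""
    cuts = sorted(sample)
    chosen = set(sample)
    loads = [0] * (len(cuts) + 1)
    for p in points:
        if p not in chosen:                      # sampled points cross no cell
            loads[bisect.bisect_left(cuts, p)] += 1
    return max(loads)


random.seed(1)
n, r = 100_000, 50
points = [random.random() for _ in range(n)]
sample = random.sample(points, r)

worst = max_cell_load(points, sample)
reference = (n / r) * math.log(r)   # the quantity (n / r) log r from the text
```

In higher dimensions the same experiment needs an actual decomposition of the sample's arrangement into subcells of constant description complexity, which is exactly the role of the vertical decomposition.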

The only general-purpose known technique for decomposing an arrangement 
of surfaces into subcells of constant description complexity is the vertical
decomposition technique. In this method, we erect a vertical ‘wall’ up and down (in
the $x_d$-direction) from each (d-2)-dimensional face of the arrangement, and
extend these walls until they hit another surface. This results in a
decomposition of the arrangement into subcells so that each subcell has a unique top facet
and a unique bottom facet, and each vertical line cuts it in a connected
(possibly empty) interval. We next project each resulting subcell on the hyperplane
$x_d = 0$, and apply recursively the same technique within each resulting
(d-1)-dimensional projected cell, and then ‘lift’ this decomposition back into d-space,
by extending each subcell c in the projection into the vertical cylinder $c \times \mathbb{R}$, and
by cutting the original cell by these cylinders. We continue the recursion in this
manner until we reach d = 1, and thereby obtain the vertical decomposition of
the given arrangement. The resulting subcells have the desired properties.
Furthermore, if we assume that the originally given surfaces are algebraic of constant
maximum degree, then the resulting subcells are semi-algebraic and are defined 
by a constant number of polynomials of constant maximum degree (although 
the latter degree can grow quite fast with d). In what follows, we ignore the 
algebraic complexity of the subcells of the vertical decomposition, and will be 
mainly interested in bounding their number as a function of n, the number of 
given surfaces. 
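
The wall-erection step can be sketched in the plane, where the (d-2)-dimensional faces are vertices. The example below assumes pairwise-disjoint, non-vertical segments in general position, so that the only vertices are segment endpoints, and it finds each wall by brute force. It illustrates the first stage of the construction only, not the recursive scheme, and the names are ours.

```python
def y_at(seg, x):
    """Height of a non-vertical segment at abscissa x."""
    (x1, y1), (x2, y2) = seg
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def vertical_walls(segments):
    """For every endpoint return (x, y_below, y_above), the extent of its wall.

    None marks a wall that extends to infinity in that direction.
    """
    walls = []
    for i, seg in enumerate(segments):
        for (x, y) in seg:
            below, above = None, None
            for j, other in enumerate(segments):
                if j == i:
                    continue
                (x1, _), (x2, _) = other
                if not (min(x1, x2) < x < max(x1, x2)):
                    continue  # the vertical line through x misses this segment
                h = y_at(other, x)
                if h > y and (above is None or h < above):
                    above = h          # nearest segment hit when going up
                elif h < y and (below is None or h > below):
                    below = h          # nearest segment hit when going down
            walls.append((x, below, above))
    return walls


segments = [((0, 0), (10, 0)), ((2, 4), (8, 4)), ((4, 2), (6, 2))]
walls = vertical_walls(segments)
```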

It was shown in [26] that the number of cells in such a vertical decomposition
of the entire arrangement is $O(n^{2d-3}\beta(n))$, where $\beta(n)$ is a slowly growing
function of n (related to the inverse Ackermann’s function). However, the only known
lower bound is the trivial $\Omega(n^d)$, so there is a considerable gap here, for d > 3;
for d = 3 the two bounds nearly coincide. Improving the upper bound appears
to be a very hard task. This problem has been open since 1989; it seems difficult
enough to preempt, at the present state of knowledge, any specific conjecture on
the true maximum complexity of the vertical decomposition of arrangements in
d > 3 dimensions.

The bound stated above applies to the vertical decomposition of an entire 
arrangement of surfaces. In many applications, however, one is interested in the 
vertical decomposition of only a portion of the arrangement, e.g., a single cell, 
the region lying below the lower envelope of the given surfaces, the zone of some 
surface, a specific collection of cells of the arrangement, etc. Since, in general, 
the complexity of such a portion is known (or conjectured) to be smaller than 
the complexity of the entire arrangement, one would like to conjecture that a 
similar phenomenon applies to vertical decompositions. Recently, it was shown
in [70] that the complexity of the vertical decomposition of a single cell in an
arrangement of n surface patches in 3-space, as above, is $O(n^{2+\varepsilon})$. As mentioned
above, this leads to a near-quadratic algorithm for computing such a single
cell, which implies that motion planning for fairly general systems with three 
degrees of freedom can be performed in near-quadratic time, thus settling a major 
open problem in the area. A similar near-quadratic bound has been obtained 
in [7] for the vertical decomposition of the region enclosed between a lower 
envelope and an upper envelope of bivariate functions. Another recent result [4] 
gives a bound on the complexity of the vertical decomposition of the first k 
levels in an arrangement of surfaces in 3-space, which is only slightly worse than 
the worst-case complexity of this undecomposed portion of the arrangement. A 
challenging open problem is to obtain improved bounds for the complexity of the
vertical decomposition of the region lying below the lower envelope of n d-variate
functions, for d > 3.

Finally, an interesting special case is that of hyperplanes. For such
arrangements, the vertical decomposition is too cumbersome a construct, because there
are other easy methods for decomposing each cell into simplices, whose total
number is $O(n^d)$. Still, it is probably a useful exercise to understand the
complexity of the vertical decomposition of an arrangement of n hyperplanes in d-space.
A recent result of [43] gives an almost tight bound of $O(n^4 \log n)$ for this problem
in 4-space, but nothing significantly better than the general bound is known for
d > 4. Another interesting special case is that of triangles in 3-space. This has
been studied in [34,77], where almost tight bounds were obtained for the case of
a single cell ($O(n^2 \log^2 n)$), and for the entire arrangement ($O(n^2\alpha(n) \log n + K)$,
where K is the complexity of the undecomposed arrangement). The first bound
is slightly better than the general bound of [70] mentioned above. The paper [77]
also derives sharp complexity bounds for the vertical decomposition of many cells
in such an arrangement, including the case of all nonconvex cells.



9 Other Applications 

We conclude this survey by mentioning some additional applications of the new 
advances in the study of arrangements. We have already discussed in some detail 
the motion planning application, and have seen how the new results lead to 
a near-optimal algorithm for the general motion planning problem with three 
degrees of freedom. Here we discuss three other kinds of applications: to visibility 
problems in three dimensions, to geometric optimization, and to transversals. 

9.1 Visibility in Three Dimensions 

Let us consider a special case of the so-called aspect graph problem, which has 
recently attracted much attention, especially in the context of three-dimensional 
scene analysis and object recognition in computer vision. The aspect graph of 
a scene represents all topologically-different views of the scene. For background 
and a survey of recent research on aspect graphs, see [25]. Here we will show how
the new complexity bounds for lower envelopes, with some additional machinery, 
can be used to derive near-tight bounds on the number of views of polyhedral 
terrains. 

Let K be a polyhedral terrain in 3-space; that is, K is the graph of a
continuous piecewise-linear bivariate function, so it intersects each vertical line in
exactly one point. Let n denote the number of edges of K. A line $\ell$ is said to
lie over K if every point on $\ell$ lies on or above K. Let $\mathcal{L}_K$ denote the space of
all lines that lie over K. (Since lines in 3-space can be parametrized by four
real parameters, we can regard $\mathcal{L}_K$ as a subset of 4-space.) Using an appropriate
parametrization, the lower envelope of $\mathcal{L}_K$ consists of those lines in $\mathcal{L}_K$ that
touch at least one edge of K. Assuming general position of the edges of K, a
line in $\mathcal{L}_K$ (or any line, for that matter) can touch at most four edges of K. We
estimate the combinatorial complexity of this lower envelope, in terms of the
number of its vertices, namely those lines in $\mathcal{L}_K$ that touch four distinct edges
of K. It was shown in [49] that the number of vertices of $\mathcal{L}_K$, as defined above,
is $O(n^3 \cdot 2^{c\sqrt{\log n}})$, for some absolute positive constant c.

We give here a sketch of the proof. We fix an edge $e_0$ of K, and bound
the number of lines of $\mathcal{L}_K$ that touch $e_0$ and three other edges of K, with the
additional proviso that the three other contact points all lie on one fixed side
of the vertical plane passing through $e_0$. We then multiply this bound by the
number n of edges, to obtain a bound on the overall number of vertices of $\mathcal{L}_K$. We
first rephrase this problem in terms of the lower envelope of a certain collection
of surface patches in 3-space, one patch for each edge of K (other than $e_0$), and
then exploit the results on lower envelopes reviewed above.

The space $\mathcal{L}_{e_0}$ of oriented lines that touch $e_0$ is 3-dimensional: each such line
$\ell$ can be specified by a triple $(t, k, \zeta)$, where t is the point of contact with $e_0$, and
$k = \tan\theta$, $\zeta = -\cot\varphi$, where $(\theta, \varphi)$ are the spherical coordinates of the direction
of $\ell$, that is, $\theta$ is the orientation of the xy-projection of $\ell$, and $\varphi$ is the angle
between $\ell$ and the positive z-axis.
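
As a small illustration of this parametrization, the sketch below converts a direction vector into the pair $(k, \zeta) = (\tan\theta, -\cot\varphi)$. It assumes a direction with nonzero x-component, and it ignores the fact that $\tan\theta$ alone does not distinguish a direction from its reflection through the z-axis; a full treatment of oriented lines has to track that separately. The function name is ours.

```python
import math


def direction_to_parameters(dx, dy, dz):
    """Map a direction (dx, dy, dz) with dx != 0 to (k, zeta) = (tan(theta), -cot(phi))."""
    horizontal = math.hypot(dx, dy)   # equals |v| * sin(phi)
    k = dy / dx                       # tan(theta): slope of the xy-projection
    zeta = -dz / horizontal           # -cos(phi) / sin(phi)
    return k, zeta


# Check against explicit spherical coordinates of a unit direction.
theta, phi = math.pi / 3, math.pi / 4
direction = (math.sin(phi) * math.cos(theta),
             math.sin(phi) * math.sin(theta),
             math.cos(phi))
k, zeta = direction_to_parameters(*direction)
```

With this sign convention a larger value of $\zeta$ corresponds to a direction that points more steeply downwards.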

For each edge $e \ne e_0$ of K, let $\sigma_e$ be the surface patch in $\mathcal{L}_{e_0}$ consisting of all
points $(t, k, \zeta)$ representing lines that touch e and are oriented from $e_0$ to e. Note
that if $(t, k, \zeta) \in \sigma_e$ then $\zeta' > \zeta$ iff the line $(t, k, \zeta')$ passes below e. It thus follows
that a line $\ell$ in $\mathcal{L}_{e_0}$ is a vertex of the lower envelope of $\mathcal{L}_K$ if and only if $\ell$ is a
vertex of the lower envelope of the surfaces $\sigma_e$ in the $tk\zeta$-space, where the height
of a point is its $\zeta$-coordinate. It is easy to show that these surfaces are algebraic
of constant description complexity. Actually, it is easily seen that the number s of
intersections of any triple of these surfaces is at most 2. The paper [49] studies the
special case of lower envelopes of collections of such algebraic surface patches
in 3-space, with the extra assumption that s = 2. It is shown there that the
complexity of the lower envelope of such a collection is $O(n^2 \cdot 2^{c\sqrt{\log n}})$, for some
absolute positive constant c, a bound that is slightly better than the general
bound stated above. These arguments immediately complete the proof. (This
bound has been independently obtained by Pellegrini [66], using a different proof
technique.) Recently, de Berg [32] has given a lower bound construction, in which
the lower envelope of $\mathcal{L}_K$ has complexity $\Omega(n^3)$, implying that the upper bound
stated above is almost tight in the worst case.

We can extend the above result as follows. Let K be a polyhedral terrain,
as above. Let $\mathcal{R}_K$ denote the space of all rays in 3-space with the property that
each point on such a ray lies on or above K. We define the lower envelope of $\mathcal{R}_K$
and its vertices in complete analogy to the case of $\mathcal{L}_K$. By inspecting the proof
sketched above, one easily verifies that it applies equally well to rays instead of
lines. Hence we obtain that the number of vertices of $\mathcal{R}_K$, as defined above, is
also $O(n^3 \cdot 2^{c\sqrt{\log n}})$. We can apply this bound to obtain a bound of
$O(n^5 \cdot 2^{c'\sqrt{\log n}})$, for some
constant c' > c, on the number of topologically-different orthographic views (i.e.,
views from infinity) of a polyhedral terrain K with n edges. We omit here details
of this analysis, which can be found in [49]. The paper [35] gives a lower bound
construction that produces $\Omega(n^5\alpha(n))$ topologically-different orthographic views
of a polyhedral terrain, so the above bound is almost tight in the worst case. It is
also instructive to note that, if K is an arbitrary polyhedral set in 3-space with n
edges, then the maximum possible number of topologically-different orthographic
views of K is $\Theta(n^6)$ [67].

Consider next the extension of the above analysis to bound the number of
perspective views of a terrain. As shown recently in [8], the problem can be
reduced to the analysis of $O(n^3)$ lower envelopes of appropriate collections of
5-variate functions. This leads to an overall bound of $O(n^{8+\varepsilon})$ for the
number of topologically-different perspective views of a polyhedral terrain with n
edges. This bound is also known to be almost tight in the worst case, as follows
from another lower bound construction given in [35]. Again, in contrast, if K
is an arbitrary polyhedral set with n edges, the maximum possible number of
topologically-different perspective views of K is $\Theta(n^9)$ [67].

9.2 Geometric Optimization 

In the past few years, many problems in geometric optimization have been at- 
tacked by techniques that reduce the problem to a problem involving arrange- 
ments of surfaces in higher dimensions. These reduced problems sometimes call 
for the construction of, and searching in, lower envelopes or other substructures
in such arrangements; one such example, that of computing the width in three 
dimensions, has been sketched in the introduction. Hence the area of geometric 
optimization is a natural extension, and a good application area, of the study 
of arrangements, as described above. See [9] for a recent survey on geometric 
optimization. 

One of the basic techniques for geometric optimization is the parametric 
searching technique, originally proposed by Megiddo [63]. Roughly speaking, the 
technique derives an efficient algorithm for the optimization problem from an 
efficient algorithm for the decision problem: determining whether the optimal
value is at most (or at least) some specified value W. This technique has been 
used to solve a wide variety of geometric optimization problems, including many 
of those that involve arrangements. Some specific results of this kind include: 

– Selecting distances in the plane: Given a set S of n points in $\mathbb{R}^2$ and a
parameter $k \le \binom{n}{2}$, find the k-th largest distance among the points of S [3].
Here the problem reduces to the construction of, and searching in, 2-dimensional
arrangements of congruent disks.

– The segment center problem: Given a set S of n points in $\mathbb{R}^2$, and a line
segment e, find a placement of e that minimizes the largest distance from
the points of S to e [41]. Using lower envelopes of certain special kinds of
bivariate functions, and applying a more careful analysis, the problem can be
solved in $O(n^{1+\varepsilon})$ time, improving substantially a previous near-quadratic
solution given in [5].

– Extremal polygon placement: Given a convex polygon P and a closed
polygonal environment Q, find the largest similar copy of P that is fully
contained in Q [76]. This is just an extension of the corresponding motion
planning problem, where the size of P is fixed. The running time of the
algorithm is almost the same as that of the motion planning algorithm given
in [57,58].

– Width in three dimensions (see also the introduction): Compute the
width of a set S of n points in $\mathbb{R}^3$; this is the smallest distance between two
parallel planes enclosing S between them. This problem has been studied in
a series of papers [2,10,27], and the current best bound is $O(n^{3/2+\varepsilon})$ [10].
The technique used in attacking this and the three following problems reduces
them to problems involving lower envelopes in 4 dimensions, where we need
to construct and to search in such an envelope.

– Translational separation of two intersecting convex polytopes: Given
two intersecting convex polytopes A and B, find the shortest translation of A
that will make its interior disjoint from B. In case A = B, the solution is the
width of A. Using a similar technique, this problem can be solved in time
$O(m^{3/4+\varepsilon} n^{3/4+\varepsilon} + m^{1+\varepsilon} + n^{1+\varepsilon})$, where m and n are the numbers of facets
of A and B, respectively [6].

– Biggest stick in a simple polygon: Compute the longest line segment
that can fit inside a given simple polygon with n edges. The current best
solution is $O(n^{3/2+\varepsilon})$ [10] (see also [2,13]).

– Smallest-width annulus: Compute the annulus of smallest width that
encloses a given set of n points in the plane. Again, the current best solution is
$O(n^{3/2+\varepsilon})$ [10] (see also [2,13], and a recent approximation algorithm in [1]).

– Geometric matching: Consider the problem where we are given two sets
$S_1$, $S_2$ of n points in the plane, and we wish to compute a minimum-weight
matching in the complete bipartite graph $S_1 \times S_2$, where the weight of an
edge (p, q) is the Euclidean distance between p and q. One can also consider
the analogous nonbipartite version of the problem, which involves just one
set S of 2n points, and the complete graph on S. The goal is to explore the
underlying geometric structure of these graphs, to obtain faster algorithms
than those available for general abstract graphs.
It was shown in [79] that both the bipartite and the nonbipartite versions of
the problem can be solved in time close to $O(n^{2.5})$. Recently, a fairly
sophisticated application of vertical decomposition in 3-dimensional arrangements,
given in [4], has improved the running time for the bipartite case to $O(n^{2+\varepsilon})$.

This list is by no means exhaustive. 
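
The reduction from optimization to decision that underlies the problems listed above can be illustrated on the distance selection problem. The sketch below is deliberately naive and is not parametric searching: it enumerates all candidate distances explicitly and performs an ordinary binary search over them, calling a brute-force decision procedure. It only shows how a monotone decision oracle pins down the optimum; all names are ours.

```python
import itertools
import math


def count_at_least(points, w):
    """Decision oracle: number of point pairs at distance >= w (brute force)."""
    return sum(1 for p, q in itertools.combinations(points, 2)
               if math.dist(p, q) >= w)


def kth_largest_distance(points, k):
    """Largest candidate w with at least k pairs at distance >= w.

    Assumes 1 <= k <= number of pairs; this value is the k-th largest distance.
    """
    candidates = sorted({math.dist(p, q)
                         for p, q in itertools.combinations(points, 2)})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if count_at_least(points, candidates[mid]) >= k:
            lo = mid          # still feasible, so try a larger value
        else:
            hi = mid - 1
    return candidates[lo]


collinear = [(0, 0), (1, 0), (3, 0)]
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

Parametric searching avoids the explicit candidate list by simulating the decision algorithm at the unknown optimum, which is where the efficiency of the technique comes from.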

9.3 Transversals 

Let S be a set of n compact convex sets in $\mathbb{R}^d$, each of constant description
complexity. The space T(S) of all hyperplane transversals of S has been defined
in the introduction. It was shown there that this space, in a dual setting, is
the region in $\mathbb{R}^d$ enclosed between the upper envelope of n surfaces and the
lower envelope of n other surfaces, where each surface in the first (resp. second)
collection represents the locus of the lower (resp. upper) tangent hyperplanes to
an object of S. The results of [7] imply that the complexity of T(S) is $O(n^{2+\varepsilon})$
in three dimensions. No similarly sharp bounds are known in higher dimensions.
The results of [7] concerning the complexity of the vertical decomposition of such
a ‘sandwiched’ region imply that T(S) can be constructed in $O(n^{2+\varepsilon})$ time.
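
The ‘sandwich’ structure is easiest to see in the plane, for line transversals of discs. In the sketch below, which is a planar analogue chosen for illustration and not the three-dimensional setting of [7], a non-vertical line is written as y = ax + b and each disc contributes a lower and an upper tangent intercept for every slope a. The line meets all discs exactly when b lies between the largest lower intercept and the smallest upper intercept. The names are ours.

```python
import math


def tangent_intercepts(disc, a):
    """Intercepts b of the lower and upper tangent lines y = a*x + b to a disc (cx, cy, r)."""
    cx, cy, r = disc
    mid = cy - a * cx
    half = r * math.sqrt(1 + a * a)
    return mid - half, mid + half


def transversal_interval(discs, a):
    """Intercepts b for which y = a*x + b meets every disc, or None if there are none."""
    lows, highs = zip(*(tangent_intercepts(disc, a) for disc in discs))
    lo = max(lows)    # upper envelope of the lower-tangent curves at slope a
    hi = min(highs)   # lower envelope of the upper-tangent curves at slope a
    return (lo, hi) if lo <= hi else None


discs = [(0, 0, 1), (5, 0.5, 1), (10, 0, 1)]
flat = transversal_interval(discs, 0.0)
steep = transversal_interval(discs, 5.0)
```

Sweeping the slope a traces out the region between the two envelopes in the dual (a, b)-plane.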

The problem can be generalized by considering lower-dimensional
transversals. For example, in three dimensions we can also consider the space of all line
transversals of S. By mapping lines in $\mathbb{R}^3$ into points in 4-space, and by using
an appropriate parametrization of the lines, the space of all line transversals
of S can also be represented as the region in $\mathbb{R}^4$ enclosed between an upper
envelope and a lower envelope of two respective collections of surfaces. Since no
sharp bounds are known for the complexity of such a region in 4-space, the exact
calibration of the complexity of the space of line transversals in 3-space is still
an open problem.



References 

1. P. Agarwal, B. Aronov, S. Har-Peled and M. Sharir, Approximation and exact
algorithms for minimum-width annuli and shells, Proc. 15th ACM Symp. on
Computational Geometry (1999), 380-389.

2. P.K. Agarwal, B. Aronov and M. Sharir, Computing envelopes in four dimensions
with applications, SIAM J. Comput. 26 (1997), 1714-1732.

3. P.K. Agarwal, B. Aronov, M. Sharir and S. Suri, Selecting distances in the plane,
Algorithmica 9 (1993), 495-514.

4. P.K. Agarwal, A. Efrat and M. Sharir, Vertical decompositions of shallow levels
in arrangements and their applications, Proc. 11th ACM Symp. on Computational
Geometry (1995), 39-50. Also to appear in SIAM J. Comput.

5. P.K. Agarwal, A. Efrat, M. Sharir and S. Toledo, Computing a segment-center for
a planar point set, J. Algorithms 15 (1993), 314-323.

6. P.K. Agarwal, L. Guibas, S. Har-Peled, A. Rabinovitch and M. Sharir, Computing
exact and approximate shortest separating translations of convex polytopes in three
dimensions, in preparation.

7. P.K. Agarwal, O. Schwarzkopf and M. Sharir, The overlay of lower envelopes in
3-space and its applications, Discrete Comput. Geom. 15 (1996), 1-13.

8. P.K. Agarwal and M. Sharir, On the number of views of polyhedral terrains,
Discrete Comput. Geom. 12 (1994), 177-182.

9. P.K. Agarwal and M. Sharir, Efficient algorithms for geometric optimization, ACM
Computing Surveys 30 (1998), 412-458.

10. P.K. Agarwal and M. Sharir, Efficient randomized algorithms for some geometric
optimization problems, Discrete Comput. Geom. 16 (1996), 317-337.

11. P. Agarwal and M. Sharir, Pipes, cigars and kreplach: The union of Minkowski 
sums in three dimensions, Proc. 15th ACM Symp. on Computational Geometry 
(1999), 143-153. 10 

12. P. Agarwal, M. Sharir and P. Shor, Sharp upper and lower bounds for the length 
of general Davenport Schinzel sequences, J. Combin. Theory, Ser. A 52 (1989), 
228-274. 4 

13. P.K. Agarwal, M. Sharir and S. Toledo, New applications of parametric searching 
in computational geometry. J. Algorithms 17 (1994), 292-318. 16, 16 

14. B. Aronov, M. Pellegrini and M. Sharir, On the zone of a surface in a hyperplane 
arrangement. Discrete Comput. Geom. 9 (1993), 177-186. 8 

15. B. Aronov, personal communication, 1995. 10, 10 

16. B. Aronov and M. Sharir, The union of convex polyhedra in three dimensions, Proc. 
34th IEEE Symp. on Foundations of Computer Science (1993), 518-527. 10, 18 

17. B. Aronov and M. Sharir, Castles in the air revisited. Discrete Comput. Geom. 12 
(1994), 119-150. 7, 7, 8 






18. B. Aronov and M. Sharir, On translational motion planning of a convex polyhedron 
in 3-space, SIAM J. Comput. 26 (1997), 1785-1803. 10, 10 

19. B. Aronov, M. Sharir and B. Tagansky, The union of convex polyhedra in three 
dimensions, SIAM J. Comput. 26 (1997), 1670-1688 (a revised version of [16]). 10 

20. F. Aurenhammer, Voronoi diagrams — A survey of a fundamental geometric data 
structure, ACM Computing Surveys 23 (1991), 346-405. 2 

21. S. Basu, On the combinatorial and topological complexity of a single cell, Proc. 
39th Annu. IEEE Sympos. Found. Comput. Sci., 1998, 606-616. 7, 7, 8, 8 

22. J.D. Boissonnat and K. Dobrindt, Randomized construction of the upper envelope 
of triangles in $\mathbb{R}^3$, Proc. 4th Canadian Conf. on Computational Geometry (1992), 
311-315. 6 

23. J.D. Boissonnat and K. Dobrindt, On-line randomized construction of the upper 
envelope of triangles and surface patches in $\mathbb{R}^3$, Comp. Geom. Theory Appls. 5 
(1996), 303-320. 6 

24. J.D. Boissonnat, M. Sharir, B. Tagansky and M. Yvinec, Voronoi diagrams in 
higher dimensions under certain polyhedral distance functions. Discrete Comput. 
Geom. 19 (1998), 485-519. 9, 9, 11 

25. K.W. Bowyer and C.R. Dyer, Aspect graphs: An introduction and survey of recent 
results, Int. J. of Imaging Systems and Technology 2 (1990), 315-328. 13 

26. B. Chazelle, H. Edelsbrunner, L. Guibas and M. Sharir, A singly exponential strati- 
fication scheme for real semi-algebraic varieties and its applications, Proc. 16th Int. 
Colloq. on Automata, Languages and Programming (1989), 179-193. 12 

27. B. Chazelle, H. Edelsbrunner, L. Guibas and M. Sharir, Diameter, width, closest 
line pair, and parametric searching. Discrete Comput. Geom. 10 (1993), 183-196. 
3, 15 

28. L.P. Chew, Near-quadratic bounds for the $L_1$ Voronoi diagram of moving points, 
Proc. 5th Canadian Conf. on Computational Geometry (1993), 364-369. 9 

29. L.P. Chew, K. Kedem, M. Sharir, B. Tagansky and E. Welzl, Voronoi diagrams of 
lines in three dimensions under polyhedral convex distance functions, J. Algorithms 
29 (1998), 238-255. 9 

30. K.L. Clarkson, New applications of random sampling in computational geometry. 
Discrete Comput. Geom. 2 (1987), 195-222. 11 

31. K.L. Clarkson and P.W. Shor, Applications of random sampling in computational 
geometry, II, Discrete Comput. Geom. 4 (1989), 387-421. 

32. M. de Berg, personal communication, 1993. 14 

33. M. de Berg, K. Dobrindt and O. Schwarzkopf, On lazy randomized incremental 
construction. Discrete Comput. Geom. 14 (1995), 261-286. 6 

34. M. de Berg, L. Guibas and D. Halperin, Vertical decomposition for triangles in 
3-space, Discrete Comput. Geom. 15 (1996), 35-61. 13 

35. M. de Berg, D. Halperin, M. Overmars and M. van Kreveld, Sparse arrangements 
and the number of views of polyhedral scenes, Internat. J. Comput. Geom. Appl. 
7 (1997), 175-195. 14, 15 

36. H. Edelsbrunner, Algorithms in Combinatorial Geometry, Springer-Verlag, 
Heidelberg 1987. 1, 2, 3, 5, 8, 8 

37. H. Edelsbrunner, The upper envelope of piecewise linear functions: Tight com- 
plexity bounds in higher dimensions. Discrete Comput. Geom. 4 (1989), 337-343. 
5 

38. H. Edelsbrunner, L. Guibas, J. Pach, R. Pollack, R. Seidel and M. Sharir, Arrange- 
ments of curves in the plane: topology, combinatorics, and algorithms, Theoret. 
Comput. Sci. 92 (1992), 319-336. 8 






39. H. Edelsbrunner and R. Seidel, Voronoi diagrams and arrangements, Discrete 
Comput. Geom. 1 (1986), 25-44. 2 

40. H. Edelsbrunner, R. Seidel and M. Sharir, On the zone theorem for hyperplane 
arrangements, SIAM J. Comput. 22 (1993), 418-429. 8 

41. A. Efrat and M. Sharir, A near-linear algorithm for the planar segment center 
problem. Discrete Comput. Geom. 16 (1996), 239-257. 15 

42. J.-J. Fu and R.C.T. Lee, Voronoi diagrams of moving points in the plane, Internat. 
J. Comput. Geom. Appl. 1 (1994), 23-32. 9 

43. L. Guibas, D. Halperin, J. Matousek and M. Sharir, On vertical decomposition 
of arrangements of hyperplanes in four dimensions. Discrete Comput. Geom. 14 
(1995), 113-122. 12 

44. L. Guibas, J. Mitchell and T. Roos, Voronoi diagrams of moving points in the 
plane, Proc. 17th Internat. Workshop Graph-Theoret. Concepts Computer Science, 
Lecture Notes in Comp. Sci., vol. 570, Springer-Verlag, pp. 113-125. 9 

45. L. Guibas and M. Sharir, Combinatorics and algorithms of arrangements, in New 
Trends in Discrete and Computational Geometry, (J. Pach, Ed.), Springer-Verlag, 
1993, 9-36. 4 

46. L. Guibas, M. Sharir and S. Sifrony, On the general motion planning problem with 
two degrees of freedom. Discrete Comput. Geom. 4 (1989), 491-521. 7 

47. D. Halperin, On the complexity of a single cell in certain arrangements of surfaces 
in 3-space, Discrete Comput. Geom. 11 (1994), 1-33. 7 

48. D. Halperin and M. Sharir, Near-quadratic bounds for the motion planning problem 
for a polygon in a polygonal environment. Discrete Comput. Geom. 16 (1996), 121- 
134. 7, 7 

49. D. Halperin and M. Sharir, New bounds for lower envelopes in three dimensions 
with applications to visibility of terrains. Discrete Comput. Geom. 12 (1994), 313- 
326. 5, 13, 14, 14 

50. D. Halperin and M. Sharir, Almost tight upper bounds for the single cell and zone 
problems in three dimensions. Discrete Comput. Geom. 14 (1995), 285-410. 7, 7, 
7, 8 

51. D. Halperin, Arrangements, in Handbook of Discrete and Computational Geometry 
(J.E. Goodman and J. O'Rourke, Editors), CRC Press, Boca Raton, FL, 1997, 
389-412. 4 

52. D. Halperin and M. Sharir, Arrangements and their applications in robotics: Re- 
cent developments, Proc. Workshop on Algorithmic Foundations of Robotics (K. 
Goldberg et al., Editors), A. K. Peters, Boston, MA, 1995, 495-511. 4 

53. D. Halperin and C.-K. Yap, Complexity of translating a box in polyhedral 3-space, 
Proc. 9th Annu. ACM Sympos. Comput. Geom. 1993, 29-37. 10 

54. S. Hart and M. Sharir, Nonlinearity of Davenport-Schinzel sequences and of gen- 
eralized path compression schemes, Combinatorica 6 (1986), 151-177. 4, 4, 5 

55. D. Haussler and E. Welzl, $\varepsilon$-nets and simplex range queries. Discrete Comput. 
Geom. 2 (1987), 127-151. 11 

56. J. Heintz, T. Recio and M.F. Roy, Algorithms in real algebraic geometry and 
applications to computational geometry, in Discrete and Computational Geometry: 
Papers from DIMACS Special Year, (J. Goodman, R. Pollack, and W. Steiger, 
Eds.), American Mathematical Society, Providence, RI, 137-163. 4 

57. K. Kedem and M. Sharir, An efficient motion planning algorithm for a convex 
rigid polygonal object in 2-dimensional polygonal space. Discrete Comput. Geom. 
5 (1990), 43-75. 15 

58. K. Kedem, M. Sharir and S. Toledo, On critical orientations in the Kedem-Sharir 
motion planning algorithm. Discrete Comput. Geom. 17 (1997), 227-239. 15 






59. J. Komlos, J. Pach and G. Woeginger, Almost tight bound on epsilon-nets, Discrete 
Comput. Geom. 7 (1992), 163-173. 11 

60. D. Leven and M. Sharir, Intersection and proximity problems and Voronoi dia- 
grams, in Advances in Robotics, Vol. I, (J. Schwartz and C. Yap, Eds.), 1987, 
187-228. 2 

61. J. Matousek, Approximations and optimal geometric divide-and-conquer, J. Com- 
put. Syst. Sci. 50 (1995), 203-208. 11 

62. P. McMullen and G. C. Shephard, Convex Polytopes and the Upper Bound 
Conjecture, Lecture Notes Ser. 3, Cambridge University Press, Cambridge, England, 
1971. 5 

63. N. Megiddo, Applying parallel computation algorithms in the design of serial al- 
gorithms, J. ACM 30, 852-865. 15 

64. B. Mishra, Computational real algebraic geometry, in Handbook of Discrete and 
Computational Geometry (J.E. Goodman and J. O’Rourke, Eds.), CRC Press LLC, 
Boca Raton, FL, 1997, 537-556. 4 

65. J. Pach and M. Sharir, The upper envelope of piecewise linear functions and the 
boundary of a region enclosed by convex plates: Combinatorial analysis. Discrete 
Comput. Geom. 4 (1989), 291-309. 5 

66. M. Pellegrini, On lines missing polyhedral sets in 3-space, Proc. 9th ACM Symp. 
on Computational Geometry (1993), 19-28. 14 

67. H. Plantinga and C. Dyer, Visibility, occlusion, and the aspect graph, Internat. J. 
Computer Vision, 5 (1990), 137-160. 14, 15 

68. F. Preparata and M. Shamos, Computational Geometry: An Introduction, 
Springer-Verlag, Heidelberg, 1985. 2 

69. J.T. Schwartz and M. Sharir, On the Piano Movers’ problem: II. General techniques 
for computing topological properties of real algebraic manifolds. Advances in Appl. 
Math. 4 (1983), 298-351. 4 

70. O. Schwarzkopf and M. Sharir, Vertical decomposition of a single cell in a 3- 
dimensional arrangement of surfaces. Discrete Comput. Geom. 18 (1997), 269-288. 
7, 12, 13 

71. M. Sharir, On $k$-sets in arrangements of curves and surfaces. Discrete Comput. 
Geom. 6 (1991), 593-613. 

72. M. Sharir, Almost tight upper bounds for lower envelopes in higher dimensions. 
Discrete Comput. Geom. 12 (1994), 327-345. 5, 6, 9 

73. M. Sharir, Arrangements in higher dimensions: Voronoi diagrams, motion plan- 
ning, and other applications, Proc. Workshop on Algorithms and Data Structures, 
Ottawa, Canada, August, 1995, Lecture Notes in Computer Science, Vol. 955, 
Springer-Verlag, 109-121. 4 

74. M. Sharir, Arrangements of surfaces in higher dimensions, in Advances in Discrete 
and Computational Geometry (Proc. 1996 AMS Mt. Holyoke Summer Research 
Conference, B. Chazelle, J.E. Goodman and R. Pollack, Eds.) Contemporary Math- 
ematics No. 223, American Mathematical Society, 1999, 335-353. 4 

75. M. Sharir and P.K. Agarwal, Davenport-Schinzel Sequences and Their Geometric 
Applications, Cambridge University Press, New York, 1995. 4, 4, 5 

76. M. Sharir and S. Toledo, Extremal polygon containment problems, Comput. Geom. 
Theory Appls. 4 (1994), 99-118. 15 

77. B. Tagansky, A new technique for analyzing substructures in arrangements. Dis- 
crete Comput. Geom. 16 (1996), 455-479. 7, 8, 13, 13 

78. B. Tagansky, The Complexity of Substructures in Arrangements of Surfaces, Ph.D. 
Dissertation, Tel Aviv University, July 1996. 9 






79. P.M. Vaidya, Geometry helps in matching, SIAM J. Comput. 18 (1989), 1201-1225. 
16 

80. A. Wiernik and M. Sharir, Planar realization of nonlinear Davenport-Schinzel 
sequences by segments. Discrete Comput. Geom. 3 (1988), 15-47. 5 



Dynamic Compressed Hyperoctrees with 
Application to the N-body Problem* 



Srinivas Aluru^1 and Fatih E. Sevilgen^2 

^1 Iowa State University, Ames, IA 50011, USA 
aluru@iastate.edu 

^2 Syracuse University, Syracuse, NY 13244, USA 
sevilgen@ecs.syr.edu 



Abstract. The hyperoctree is a popular data structure for organizing 
multidimensional point data. Its main drawback is that its size and the 
run-time of the operations it supports depend on the distribution of the 
points. Clarkson rectified the distribution-dependency in the size of 
hyperoctrees by introducing compressed hyperoctrees. He presents an 
$O(n \log n)$ expected time randomized algorithm to construct a compressed 
hyperoctree. In this paper, we give three deterministic algorithms to 
construct a compressed hyperoctree in $O(n \log n)$ time, for any fixed 
dimension $d$. We present $O(\log n)$ algorithms for point and cubic region 
searches, point insertions and deletions. We propose a solution to the 
N-body problem in $O(n)$ time, given the tree. Our algorithms also reduce 
the run-time dependency on the number of dimensions. 



1 Introduction 

Hyperoctrees are used in a number of application areas such as computational 
geometry, computer graphics, databases and scientific computing. In this paper, 
we focus our attention on hyperoctrees for any fixed dimension $d$. These are 
popularly known as quadtrees in two dimensions and octrees in three dimensions. 
Hyperoctrees can be constructed by starting with a cubic region containing all 
the points and recursively subdividing it into $2^d$ equal-sized cubic regions 
until every resulting region contains at most one point. 

We term a data structure distribution-independent if its size is a function 
of the number of points only and is not dependent upon the distribution of 
the input points. We term an algorithm distribution-independent if its run-time 
is independent of the distribution of the points. Hyperoctrees are 
distribution-dependent because recursively subdividing a cubic region may 
result in just one occupied region for an arbitrarily large number of steps. 
In terms of the tree, there can be an arbitrarily long path without any 
branching. If the properties of interest to the problem at hand depend only on 
the points, all the nodes along such a path essentially contain the same 
information. Thus, one can compress such a path into a single node without 
losing any information. 

* This research is supported in part by ARO under DAAG55-97-1-0368, NSF 
CAREER under CCR-9702991 and Sandia National Laboratories. The content of the 
information does not necessarily reflect the position or the policy of the U.S. federal 
government, and no official endorsement should be inferred. 

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 21-33, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 

Such compressed hyperoctrees have been introduced by Clarkson [10] in the 
context of the all nearest neighbors problem. Clarkson presents a randomized 
algorithm to construct such a tree for $n$ $d$-dimensional points in 
$O(c^d n \log n)$ expected time (where $c$ is a constant). Later, Bern [5] 
proposed an $O((cd)^d n \log n)$ time deterministic algorithm for construction 
and used centroid decomposition [9] for $O(d \log n)$ time point searches. 
Callahan and Kosaraju's fair split tree achieves run-time complexities similar 
to those of the algorithms presented in this paper by allowing non-cubical 
regions [8]. Dynamizing these and similar data structures has been discussed 
in [4], [6], [8] and [15] by using auxiliary data structures, such as topology 
trees [13] or dynamic trees [16]. Further research on related techniques can 
be found in [2], [3], [11], [14] and [18]. 

In this paper, we present a dynamic variant of compressed hyperoctrees 
without using any of the aforementioned complicated auxiliary data structures. 
We call this new data structure the Distribution-Independent Adaptive Tree 
(DIAT). We present three deterministic algorithms that each construct the DIAT 
tree in $O(dn \log n)$ time. We present algorithms for point and cubic region 
searches, point insertion and point deletion that run in $O(d \log n)$ time. 
We assume an extended model of computation in which the floor, bitwise 
exclusive-or and logarithm operations can be done in constant time. 

We expect that any algorithm that makes use of hyperoctrees can use DIAT 
trees instead to eliminate the dependence on distribution and obtain faster 
running times. As evidence, we provide an algorithm to solve the N-body problem 
in $O(n)$ time (using the standard algebraic model of computation) on a given 
DIAT tree. 



2 The DIAT Tree 

We make use of the following terminology to describe DIAT trees: Call a cubic 
region containing all the points the root cell. Define a hierarchy of cells by the 
following: The root cell is in the hierarchy. If a cell is in the hierarchy, then the 
$2^d$ equal-sized cubic subregions obtained by bisecting along each coordinate 
of the cell are also called cells and belong to the hierarchy. Two cells are 
disjoint if they do not intersect at all or if they are merely adjacent, i.e., they 
intersect only at the boundaries. Given two cells in the hierarchy, possibly of 
different sizes, either one is completely contained in the other or they are 
disjoint. We use the term subcell to describe a cell that is completely contained 
in another. The $2^d$ subcells obtained by bisecting a cell along each dimension 
are also called the immediate subcells with respect to the bisected cell. Also, a 
cell is a supercell (immediate supercell) of any of its subcells (immediate 
subcells). We use the notation $C \subseteq D$ to indicate that $C$ is a subcell 
of $D$, and $C \subset D$ to indicate that $C$ is a subcell of $D$ and $C$ is 
strictly smaller than $D$. Define the length of a cell $C$, denoted 
length($C$), to be the span of $C$ along any dimension. 
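The cell hierarchy can be made concrete with a small sketch. It assumes, purely for illustration, that the root cell has an integer power-of-two side and that a cell is stored as a pair `(corner, side)`; the function names are hypothetical and do not come from the paper.

```python
from itertools import product

# Assumed representation: a cell is (corner, side), where corner is a tuple of
# integer coordinates and side is a power of two. Cells are half-open boxes.

def immediate_subcells(cell):
    """Return the 2^d immediate subcells obtained by bisecting every axis."""
    corner, side = cell
    half = side // 2
    return [(tuple(c + half * b for c, b in zip(corner, bits)), half)
            for bits in product((0, 1), repeat=len(corner))]

def is_subcell(c, d):
    """True if cell c is contained in cell d (both taken from the hierarchy)."""
    (cc, cs), (dc, ds) = c, d
    return cs <= ds and all(b <= a < b + ds for a, b in zip(cc, dc))

def disjoint(c, d):
    """Two hierarchy cells are either nested or disjoint."""
    return not is_subcell(c, d) and not is_subcell(d, c)

def length(cell):
    """The span of the cell along any dimension."""
    return cell[1]
```

Because every cell in the hierarchy is aligned to a grid of its own size, testing the corner and the side is enough to decide containment in this model.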






The DIAT tree is essentially a compressed hyperoctree with some pointers 
attached to it. Each node $v$ in the DIAT tree contains two cells, the large cell 
of $v$ and the small cell of $v$, denoted by $L(v)$ and $S(v)$ respectively. The 
large cell is the largest cell that the node is responsible for and the small cell 
is the smallest subcell of the large cell that contains all points within the large 
cell. Consider any internal node $v$ in the DIAT tree. To obtain its children, we 
subdivide $S(v)$ into $2^d$ equal-sized subcells. Note that subdivision of $S(v)$ 
results in at least two non-empty subcells. For each non-empty subcell $C$ 
resulting from the subdivision, there is a child $u$ such that $L(u) = C$. Since 
each internal node has at least two children, the size of a DIAT tree for $n$ 
points is $O(n)$. 
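To illustrate the roles of the large and small cells, here is a minimal sketch that builds the node structure by naive top-down subdivision. It is not one of the paper's construction algorithms and makes no running-time claim. It assumes distinct points with non-negative integer coordinates, cells stored as `(corner, side)`, and a leaf whose small cell is the unit cell of its point; the names `Node`, `build` and `size` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    large: tuple            # L(v) as (corner, side)
    small: tuple            # S(v) as (corner, side)
    point: tuple = None     # set for leaves only
    children: list = field(default_factory=list)

def smallest_cell(points):
    """Smallest hierarchy cell containing all the given integer points."""
    lo = tuple(map(min, zip(*points)))
    hi = tuple(map(max, zip(*points)))
    m = max((a ^ b).bit_length() for a, b in zip(lo, hi))
    return tuple((a >> m) << m for a in lo), 1 << m

def build(points, large):
    """Naive recursive construction, for illustration only."""
    small = smallest_cell(points)
    if len(points) == 1:
        return Node(large, small, point=points[0])
    corner, side = small
    half = side // 2
    groups = {}
    for p in points:        # group the points by immediate subcell of S(v)
        key = tuple(c + half * ((x - c) // half) for x, c in zip(p, corner))
        groups.setdefault(key, []).append(p)
    node = Node(large, small)
    for key in sorted(groups):
        node.children.append(build(groups[key], (key, half)))
    return node

def size(v):
    """Number of nodes in the subtree rooted at v."""
    return 1 + sum(size(c) for c in v.children)
```

For the points (0,0), (1,1) and (63,63) in a root cell of side 64, the child holding the two nearby points has a large cell of side 32 but a small cell of side 2, which is exactly the chain of single-child nodes that compression removes.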

DIAT trees can be searched in a manner similar to binary search trees. Such 
algorithms are not efficient on DIAT trees because the height of a DIAT tree is 
$O(n)$ in the worst case. To speed up accesses and to allow efficient 
maintenance in the dynamic case, we equip the DIAT tree with pointers. 

Each internal node in the DIAT tree is pointed to by a leaf. An internal node 
is pointed to by the leftmost leaf in its subtree, unless its leftmost leaf is also 
the leftmost leaf in its parent's subtree. In such a case, the node is pointed to 
by the rightmost leaf in its subtree. The following lemma asserts that each leaf 
needs to store only one pointer. 

Lemma 1. Each leaf in a DIAT tree points to at most one internal node. 

Proof. A leaf node that is neither the leftmost nor the rightmost child of its 
parent does not point to any internal node. A leaf that is the leftmost child 
of its parent points only to its ancestor closest to the root for which this leaf 
is the leftmost leaf in the ancestor's subtree. Now consider a leaf $l$ that is the 
rightmost child of its parent. Suppose this leaf has two or more pointers. Consider 
any two of its pointers, and the nodes pointed to by them, say $u$ and $v$. Since 
$l$ is the rightmost leaf in the subtrees of $u$ and $v$, one must be an ancestor 
of the other. Without loss of generality, let $u$ be the ancestor of $v$. Since $v$ 
must be the rightmost child of its parent, it should be pointed to by the leftmost 
leaf in its subtree and not the rightmost leaf $l$. This contradicts our earlier 
assumption that $l$ points to $v$. It follows that each leaf points to at most one 
internal node. 



2.1 Building the DIAT Tree 

In the subsequent algorithms to be presented for the construction and manipu- 
lation of DIAT trees, we make use of the following result by Clarkson [10]: 

Lemma 2. Let $R$ be the product of $d$ intervals $I_1 \times I_2 \times \cdots \times I_d$, 
i.e., $R$ is a hyperrectangular region in $d$-dimensional space. The smallest cell 
containing $R$ can be found in constant time. 

Proof. See [10]. The required smallest cell can be determined in $O(d)$ time, which 
is constant for any fixed dimension $d$. The procedure uses floor, logarithm and 
bitwise exclusive-or operations. 
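The flavor of Lemma 2 can be shown in a simplified setting. The sketch below is not Clarkson's procedure for real coordinates; it assumes non-negative integer coordinates inside a root cell of power-of-two side, so that the floor and logarithm steps reduce to integer bit operations. The function name is hypothetical.

```python
def smallest_cell(lo, hi):
    """Smallest hierarchy cell containing the box with opposite corners lo, hi.

    Assumes integer coordinates with lo[j] <= hi[j]. Returns (corner, side).
    """
    # The highest bit in which lo and hi differ along any axis fixes the side:
    # bit_length plays the role of floor(log2(.)) + 1 applied to the XOR.
    m = max((a ^ b).bit_length() for a, b in zip(lo, hi))
    # Clearing the low m bits of each coordinate gives the cell's corner.
    corner = tuple((a >> m) << m for a in lo)
    return corner, 1 << m
```

The work is one exclusive-or and one bit-length per dimension, i.e., $O(d)$ operations in this integer model. For example, the points 3 and 4 on one axis straddle the midpoint of a cell of side 8, so no smaller cell contains both.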






In what follows, we present three algorithms that construct the DIAT tree 
in $O(n \log n)$ time. While all three algorithms achieve the same time bound, 
understanding the first two is necessary as the ideas developed in them are 
needed to make the tree dynamic. 

A Divide-and-Conquer Algorithm. Let $T_1$ and $T_2$ be two DIAT trees 
representing two distinct sets $S_1$ and $S_2$ of points. Let $r_1$ (resp., $r_2$) 
be the root node of $T_1$ (resp., $T_2$). Suppose that $L(r_1) = L(r_2)$, i.e., both 
$T_1$ and $T_2$ are constructed starting from a cell large enough to contain 
$S_1 \cup S_2$. A DIAT tree $T$ for $S_1 \cup S_2$ can be constructed in 
$O(|S_1| + |S_2|)$ time by merging $T_1$ and $T_2$. 

To merge $T_1$ and $T_2$, we start at their roots $r_1$ and $r_2$. Suppose that at 
some stage during the execution of the algorithm, we are at node $v_1$ in $T_1$ and 
at node $v_2$ in $T_2$ with the task of merging the subtrees rooted at $v_1$ and 
$v_2$. An invariant of the merging algorithm is that $L(v_1)$ and $L(v_2)$ cannot 
be disjoint. Furthermore, it can be asserted that 
$L(v_1) \cap L(v_2) \supseteq S(v_1) \cup S(v_2)$. For convenience, assume that a 
node may be empty. If a node has fewer than $2^d$ children, we may assume empty 
nodes in the place of absent children. The following possibilities arise: 

- Case I: $v_1$ is an empty node. Return the subtree rooted at $v_2$. 

- Case II: $S(v_1) = S(v_2)$. In this case, merge each child of $v_1$ with the 
corresponding child of $v_2$ and return the subtree rooted at $v_2$. 

- Case III: $S(v_1) \subset S(v_2)$. There exists a child $u$ of $v_2$ such that 
$S(v_1) \subseteq L(u)$. Merge $v_1$ with $u$ and return the subtree rooted at $v_2$. 

- Case IV: $S(v_1)$ and $S(v_2)$ are disjoint. Create a new node $v$ with 
$L(v) = L(v_1) \cap L(v_2)$. Set $S(v)$ to the smallest cell containing 
$S(v_1) \cup S(v_2)$. Subdivide $S(v)$ to separate $S(v_1)$ and $S(v_2)$ and create 
two corresponding children of $v$. Return the subtree rooted at $v$. 

Cases I and III admit symmetric cases with the roles of $v_1$ and $v_2$ 
interchanged and can be handled similarly. The merging algorithm performs a 
preorder traversal of each tree. In every step of the merging algorithm, we 
advance on one of the trees after performing at most a constant amount of work. 
Thus, two DIAT trees with a common root cell can be merged in time proportional 
to the sum of their sizes. Using the merging algorithm and a recursive 
divide-and-conquer, a DIAT tree can be built in $O(n \log n)$ time. 
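The four merge cases can be sketched as follows. This is an illustrative reading of the case analysis, not the authors' implementation. It assumes integer coordinates, cells stored as `(corner, side)`, leaves whose small cell is the unit cell of their point, and children kept in a dictionary keyed by subcell corner. The extra `large` argument (the cell the merged node is responsible for) and all names are assumptions of this sketch.

```python
class Node:
    def __init__(self, large, small, point=None):
        self.large, self.small, self.point = large, small, point
        self.children = {}      # corner of an immediate subcell of S(v) -> child

def inside(c, d):
    """True if hierarchy cell c is contained in hierarchy cell d."""
    (cc, cs), (dc, ds) = c, d
    return cs <= ds and all(b <= a < b + ds for a, b in zip(cc, dc))

def child_cell(cell, p):
    """Immediate subcell of `cell` that contains the point p."""
    corner, side = cell
    half = side // 2
    return tuple(c + half * ((x - c) // half) for c, x in zip(corner, p)), half

def join_cells(a, b):
    """Smallest hierarchy cell containing both cells a and b."""
    lo = tuple(map(min, a[0], b[0]))
    hi = tuple(max(x + a[1], y + b[1]) - 1 for x, y in zip(a[0], b[0]))
    m = max((s ^ t).bit_length() for s, t in zip(lo, hi))
    return tuple((s >> m) << m for s in lo), 1 << m

def merge(v1, v2, large):
    if v1 is None or v2 is None:                  # Case I and its mirror
        v = v2 if v1 is None else v1
    elif v1.small == v2.small:                    # Case II
        v = v2
        for key, child in v1.children.items():
            v.children[key] = merge(child, v.children.get(key), child.large)
    elif inside(v1.small, v2.small):              # Case III
        v = v2
        cell = child_cell(v.small, v1.small[0])
        v.children[cell[0]] = merge(v1, v.children.get(cell[0]), cell)
    elif inside(v2.small, v1.small):              # mirror of Case III
        v = v1
        cell = child_cell(v.small, v2.small[0])
        v.children[cell[0]] = merge(v2, v.children.get(cell[0]), cell)
    else:                                         # Case IV
        v = Node(large, join_cells(v1.small, v2.small))
        for w in (v1, v2):
            w.large = child_cell(v.small, w.small[0])
            v.children[w.large[0]] = w
    if v is not None:
        v.large = large
    return v

def build(points, root_cell):
    """Recursive divide-and-conquer: build both halves, then merge them."""
    if len(points) == 1:
        return Node(root_cell, (points[0], 1), points[0])
    mid = len(points) // 2
    return merge(build(points[:mid], root_cell),
                 build(points[mid:], root_cell), root_cell)
```

The sketch does not assign leaf pointers; that step is described separately in the text.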

Once the tree is built, it remains to assign the pointers from leaves to internal 
nodes. A recursive algorithm to assign all required pointers is given in Figure 1. 
The procedure Assign-Pointers, when called with a node in the DIAT tree, 
assigns pointers to all internal nodes in the node's subtree. If the node is the 
leftmost (resp., rightmost) child of its parent, it returns the leftmost (resp., 
rightmost) leaf in its subtree. The run-time of the algorithm is proportional to 
the size of the tree, which is $O(n)$. 
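The pointer-assignment procedure described above can be sketched in Python. The base case (a leaf returns itself) is not spelled out in the source and is an assumption of this sketch, as are the `Node` class and its field names.

```python
class Node:
    def __init__(self, children=()):
        self.children = list(children)   # ordered left to right
        self.parent = None
        self.pointer = None              # meaningful for leaves only
        for c in self.children:
            c.parent = self

def assign_pointers(v):
    """Assign leaf pointers to every internal node in the subtree of v."""
    if not v.children:                   # assumed base case: a leaf is its own
        return v                         # leftmost and rightmost leaf
    p = assign_pointers(v.children[0])   # leftmost leaf in v's subtree
    q = assign_pointers(v.children[-1])  # rightmost leaf in v's subtree
    for u in v.children[1:-1]:
        assign_pointers(u)
    parent = v.parent
    if parent is not None and parent.children[0] is v:
        q.pointer = v                    # p is reserved for an ancestor
        return p
    if parent is not None and parent.children[-1] is v:
        p.pointer = v
        return q
    p.pointer = v                        # middle child or root
    return None
```

On a tree whose root has two internal children, each with two leaves, the leftmost leaf ends up pointing to the root, the second leaf to the left child, the third leaf to the right child, and the last leaf to nothing, in agreement with Lemma 1.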

Algorithm 1 Assign-Pointers(v) 

    p = Assign-Pointers(leftmost child of v) 
    q = Assign-Pointers(rightmost child of v) 
    for every other child u of v 
        Assign-Pointers(u) 
    If v is the leftmost child of its parent 
        q.pointer = v, return(p) 
    If v is the rightmost child of its parent 
        p.pointer = v, return(q) 
    Else p.pointer = v, return(nil) 

Fig. 1. Algorithm for assigning pointers from leaves to internal nodes. 

Bottom-up Construction of the DIAT Tree. Let $f$ be any bijective function 
that maps the $2^d$ immediate subcells of a cell to the set $\{1, 2, \ldots, 2^d\}$. 
Consider an internal node $v$ and let $u_1$ and $u_2$ be two children of $v$. Define 
an ordering as follows: $u_1$ appears earlier in the ordering than $u_2$ if and only 
if $f(L(u_1)) < f(L(u_2))$. If $u_1$ appears earlier in the ordering than $u_2$, then 
every node in the subtree of $u_1$ appears earlier in the ordering than any node in 
the subtree of $u_2$. The ordering can be extended to sets of nodes and is a 
complete order for any set satisfying the property that no node in the set is an 
ancestor of another. In particular, the leaves of a DIAT tree (or the points) can be 
ordered. The left-to-right order of the corresponding leaves in the DIAT tree is the 
same as this ordering, if the children of each node are ordered according to the 
function $f$. 

To perform a bottom-up construction, first compute the ordering of the points 
as they should appear in the DIAT tree and compute the tree bottom-up using 
this order. A similar ordering has been used by Bern et al. [4] as a way of 
constructing hyperoctrees for points with integer coordinates. To order the points 
according to the DIAT tree, we establish a procedure that orders any two points: 
Given two points $p_1$ and $p_2$, compute the smallest subcell containing them. 
Subdividing this smallest subcell into its immediate subcells separates $p_1$ and 
$p_2$. The ordering of the two points is the same as the ordering of the immediate 
subcells they belong to and can be determined in $O(1)$ time by Lemma 2. 
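The two-point ordering procedure can be sketched under simplifying assumptions: integer coordinates, and one particular choice of the bijection $f$ that reads the distinguishing bit of each coordinate as a binary number (first coordinate most significant) and adds one. Both the choice of $f$ and the function names are illustrative, not taken from the paper.

```python
from functools import cmp_to_key

def subcell_rank(p, m):
    """f applied to the immediate subcell (of the side-2^m cell) containing p."""
    r = 0
    for x in p:
        r = (r << 1) | ((x >> (m - 1)) & 1)
    return r + 1                      # values in {1, ..., 2^d}

def compare(p, q):
    """Negative if p precedes q, positive if q precedes p, zero if equal."""
    # The smallest cell containing both points has side 2^m.
    m = max((a ^ b).bit_length() for a, b in zip(p, q))
    if m == 0:
        return 0                      # identical points
    return subcell_rank(p, m) - subcell_rank(q, m)

def sort_points(points):
    """Sort points into the left-to-right leaf order induced by f."""
    return sorted(points, key=cmp_to_key(compare))
```

With this choice of $f$ the resulting order coincides with the familiar bit-interleaved (Z-order) curve, which is one convenient instance of the ordering and not the only one the definition allows.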

Given $n$ points and a root cell containing them, the points can be sorted in 
$O(n \log n)$ time according to the total order just defined by using any optimal 
sorting algorithm. The DIAT tree is then incrementally constructed using this 
sorted list of points starting from the single node tree for the first point and the 
root cell. During the insertion process, keep track of the most recently inserted 
leaf. Let $p$ be the next point to be inserted. Starting from the most recently 
inserted leaf, traverse the path from the leaf to the root until we find the first 
node $v$ such that $p \in L(v)$. Two possibilities arise: 



- Case I: $p \notin S(v)$. Create a new node $u$ in place of $v$ where $S(u)$ is the 
smallest subcell containing $p$ and $S(v)$. Make $v$ (along with its subtree) and 
the node containing $p$ children of $u$. 

- Case II: $p \in S(v)$. Consider the immediate subcell of $S(v)$ that contains $p$. 
The DIAT tree presently does not contain a child for $v$ that corresponds to 
this subcell. Therefore, the node containing $p$ can be inserted as a child of $v$. 

Once the points are sorted, the rest of the algorithm is identical to a 
post-order walk on the final DIAT tree with $O(1)$ work per node. The time for a 
single insertion is not bounded by a constant but all $n$ insertions together 
require only $O(n)$ time. Combined with the initial sorting of the points, the tree 
can be constructed in $O(n \log n)$ time. The pointer assignment procedure 
described in Algorithm 1 can be used to assign the required pointers. 
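A compact sketch of the bottom-up construction follows. It assumes distinct points with integer coordinates, cells stored as `(corner, side)`, leaves whose small cell is the unit cell of their point, and the bit-based ordering as one admissible choice of $f$. All names are hypothetical, and pointer assignment is left out.

```python
from functools import cmp_to_key

class Node:
    def __init__(self, large, small, point=None):
        self.large, self.small, self.point = large, small, point
        self.parent, self.children = None, []

def contains(cell, p):
    corner, side = cell
    return all(c <= x < c + side for c, x in zip(corner, p))

def smallest_cell(lo, hi):
    m = max((a ^ b).bit_length() for a, b in zip(lo, hi))
    return tuple((a >> m) << m for a in lo), 1 << m

def child_cell(cell, p):
    corner, side = cell
    half = side // 2
    return tuple(c + half * ((x - c) // half) for c, x in zip(corner, p)), half

def z_compare(p, q):
    m = max((a ^ b).bit_length() for a, b in zip(p, q))
    if m == 0:
        return 0
    rp = [(x >> (m - 1)) & 1 for x in p]
    rq = [(x >> (m - 1)) & 1 for x in q]
    return (rp > rq) - (rp < rq)

def build_bottom_up(points, root_cell):
    pts = sorted(points, key=cmp_to_key(z_compare))
    root = last = Node(root_cell, (pts[0], 1), pts[0])
    for p in pts[1:]:
        v = last
        while not contains(v.large, p):      # climb until p lies in L(v)
            v = v.parent
        if not contains(v.small, p):         # Case I: new node u replaces v
            corner, side = v.small
            far = tuple(c + side - 1 for c in corner)
            lo, hi = tuple(map(min, p, corner)), tuple(map(max, p, far))
            u = Node(v.large, smallest_cell(lo, hi))
            u.parent = v.parent
            if v.parent is None:
                root = u
            else:
                v.parent.children[-1] = u    # v lies on the rightmost path
            v.large = child_cell(u.small, v.small[0])
            v.parent = u
            u.children.append(v)
            v = u
        leaf = Node(child_cell(v.small, p), (p, 1), p)   # Case II, or end of Case I
        leaf.parent = v
        v.children.append(leaf)
        last = leaf
    return root
```

Every node on the path from the most recently inserted leaf to the root is the last child of its parent, which is why replacing `children[-1]` suffices in Case I.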



Construction by Repeated Insertions. An algorithm to insert a point in 
an $n$-point DIAT tree in $O(\log n)$ time is presented later. Using this algorithm, 
a DIAT tree can be constructed by repeated insertions in $O(n \log n)$ time. 

Theorem 1. A DIAT tree for $n$ points can be constructed in $O(n \log n)$ time. 

2.2 Querying DIAT Trees 

We consider two types of searches: point searches and cell searches. To facilitate 
fast algorithms, an auxiliary data structure is used in conjunction with the DIAT 
tree. The auxiliary data structure is a balanced binary search tree (abbreviated 
BBST) built on the input points using their order of appearance as leaves in 
the DIAT tree. Given the DIAT tree, the sorted order of points according to the 
DIAT tree can be read in $O(n)$ time, followed by a binary search tree 
construction on this sorted data taking an additional $O(n)$ time. Each node in the 
BBST represents a point and contains a pointer to the leaf representing the same 
point in the DIAT tree. 

The general idea behind searches is as follows: The searches are always first 
conducted in the BBST, which helps in locating the relevant leaves in the DIAT 
tree. The DIAT tree itself is then accessed from the leaves. 



Point Search. To locate a point in the DIAT tree, first locate it in the BBST. 
If the point does not exist in the BBST, it does not exist in the DIAT tree. 
Otherwise, the node in the BBST has a pointer to the corresponding leaf in 
the DIAT tree. The search in the BBST is performed using the aforementioned 
ordering procedure. The overall search time is $O(\log n)$. 
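A point search can be sketched with a sorted Python list standing in for the BBST. This substitution is an assumption made for brevity: a list gives logarithmic search but not logarithmic updates. The ordering is the bit-based instance of $f$ on integer coordinates, the returned index stands in for the pointer to the DIAT leaf, and the names are hypothetical.

```python
from bisect import bisect_left
from functools import cmp_to_key

def z_compare(p, q):
    """Order two integer points by the immediate subcells that separate them."""
    m = max((a ^ b).bit_length() for a, b in zip(p, q))
    if m == 0:
        return 0
    rp = [(x >> (m - 1)) & 1 for x in p]
    rq = [(x >> (m - 1)) & 1 for x in q]
    return (rp > rq) - (rp < rq)

KEY = cmp_to_key(z_compare)

def make_index(points):
    """Return the points in leaf order together with their comparison keys."""
    pts = sorted(points, key=KEY)
    return pts, [KEY(p) for p in pts]

def point_search(pts, keys, p):
    """Index of p in leaf order, or None if p is not stored."""
    i = bisect_left(keys, KEY(p))
    if i < len(pts) and pts[i] == p:
        return i
    return None
```

Each comparison made by the binary search is one call to the two-point ordering procedure, so the search uses a logarithmic number of such calls.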

Cell Search. Given a cell $C$, the cell search problem is to locate the node in 
the DIAT tree representing $C$. A node $v$ in the tree is said to represent $C$ if 
$S(v) \subseteq C \subseteq L(v)$, i.e., the points in cell $C$ are exactly the 
points in the subtree under $v$. If $C$ does not contain any points, the search 
reports that the cell does not exist in the DIAT tree. 

Consider a given cell $C$ and the bijective function $f$ used in defining the 
ordering of immediate subcells of a cell. Identify the two immediate subcells 
$C_1$ and $C_2$ of $C$ such that $f(C_1) \le f(C') \le f(C_2)$ for any immediate 
subcell $C'$ of $C$. Find the corner $l$ of $C_1$ (and $h$ of $C_2$) that is not 
adjacent to any other immediate subcell of $C$. Locate the first point $p_1$ that 
should appear after $l$ and the last point $p_2$ that should appear before $h$ in 
the BBST. It is clear that $p_1$ is the leftmost leaf in the subtree at the node 
representing $C$ in the DIAT tree and $p_2$ is the rightmost leaf. Since each node 
is pointed to by either the leftmost or the rightmost leaf in its subtree, one of 
these leaves leads to the required node. Therefore, a cell search is equivalent to 
two point searches followed by $O(1)$ work, for a total of $O(\log n)$ run-time. 
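The two boundary searches of a cell search can be sketched as follows, again with a sorted list standing in for the BBST and with the bit-based ordering on integer coordinates as an assumed instance of $f$. Under that ordering the corners $l$ and $h$ are the cell's lowest and highest grid points. The sketch returns the range of leaves inside the cell; following the leaf pointer to the representing node is omitted. Names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from functools import cmp_to_key

def z_compare(p, q):
    m = max((a ^ b).bit_length() for a, b in zip(p, q))
    if m == 0:
        return 0
    rp = [(x >> (m - 1)) & 1 for x in p]
    rq = [(x >> (m - 1)) & 1 for x in q]
    return (rp > rq) - (rp < rq)

KEY = cmp_to_key(z_compare)

def leaf_range(pts, keys, cell):
    """Return (i, j) such that pts[i:j] are exactly the points inside `cell`.

    pts must be in leaf order and keys must equal [KEY(p) for p in pts].
    """
    corner, side = cell
    low = corner                                    # corner l of C_1
    high = tuple(c + side - 1 for c in corner)      # corner h of C_2
    i = bisect_left(keys, KEY(low))                 # first point at or after l
    j = bisect_right(keys, KEY(high))               # one past last point up to h
    return i, j
```

If the returned range is empty, the cell contains no points; otherwise `pts[i]` and `pts[j - 1]` play the roles of $p_1$ and $p_2$.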

2.3 Dynamic Operations on DIAT Trees 

Point Insertion. To insert a point $q$ in the DIAT tree, we first insert it in 
the BBST and find its predecessor point $p$ and successor point $s$. The smallest 
cell that contains $p$ and $q$ and the smallest cell that contains $s$ and $q$ are 
either the same or one is contained in the other. Let $w$ be the node in the DIAT 
tree representing the smaller of these two cells. This can be located in 
$O(\log n)$ time using a cell search. If $q \in S(w)$, the immediate subcell of 
$S(w)$ containing $q$ does not contain any other point and $q$ is inserted as a 
child of $w$. Otherwise, create and insert a new node $x$ as the parent of $w$ 
where $S(x)$ is the smallest cell containing both $q$ and $S(w)$. The node 
representing $q$ is then inserted as a child of $x$. 

During insertion, pointers need to be updated to be consistent with the 
pointer assignment mechanism in DIAT trees. For example, if the newly inserted 
leaf is a leftmost child of its parent, it has to take over the pointer from 
the previous leftmost child in the parent's subtree. For all possible cases that 
may arise during insertion, it can be shown that only a constant number of 
pointer updates are needed. The total run-time is bounded by $O(\log n)$. 



Point Deletion. To delete a point $p$, first search for $p$ in the BBST and identify 
the corresponding leaf in the DIAT tree. Delete $p$ from the BBST and delete 
the leaf containing $p$ from the DIAT tree. If the removal of the leaf leaves its 
parent with only one child, simply delete the parent and assign to the other 
child the largest cell it is responsible for. Since each internal node has at least 
two children, the delete operation cannot propagate to higher levels in the 
DIAT tree. Like insertions, deletions also require a constant number of pointer 
adjustments. Deletions can be performed in $O(\log n)$ time. 

Theorem 2. Point search, cell search, point insertion and point deletion in a 
DIAT tree of $n$ points can be performed in $O(\log n)$ time. 

3 The N-body Problem 

The N-body problem is defined as follows: Given n bodies and their positions, 
where each pair of bodies interacts with a force inversely proportional to the 
square of the distance between them, compute the force on each body due to all 
other bodies. A direct algorithm for computing all pairwise interactions requires 
$O(n^2)$ time. Greengard's fast multipole method [12], which uses a hyperoctree 
data structure, reduces this complexity by approximating the interaction 
between clusters of particles instead of computing individual interactions. For each 
cell in the hyperoctree, the algorithm computes a multipole expansion and a 
local expansion. The multipole expansion at a cell C, denoted $\phi(C)$, is the effect 
of the particles within the cell C on distant points. The $\phi$'s are computed by a 
bottom-up traversal of the hyperoctree. The local expansion at a cell C, denoted 
$\psi(C)$, is the effect of all distant particles on the points within the cell C. The $\psi$'s 
are computed by a top-down traversal (by combining the multipole expansions of 
well-separated cells). Though widely accepted to be O(n), Greengard's algorithm 
is distribution-dependent, and the number of cell-cell interactions is proportional 
to the size of the hyperoctree. 
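
For reference, the direct method mentioned above can be sketched in a few lines. This is a minimal illustration, assuming two-dimensional positions, attractive forces, and a unit force constant; the function name `direct_forces` is hypothetical.

```python
def direct_forces(positions, masses):
    """Pairwise inverse-square forces in 2-D; quadratic in the number of bodies."""
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r2 = dx * dx + dy * dy
            # Magnitude m_i * m_j / r^2 along the unit vector (dx, dy) / r.
            scale = masses[i] * masses[j] / (r2 * r2 ** 0.5)
            forces[i][0] += scale * dx
            forces[i][1] += scale * dy
            # Equal and opposite force on body j.
            forces[j][0] -= scale * dx
            forces[j][1] -= scale * dy
    return forces
```

The double loop visits each of the n(n-1)/2 pairs once, which is the quadratic cost that the tree-based methods avoid.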

Recently, Callahan et al. presented a distribution-independent algorithm for 
solving the N-body problem [7]. The algorithm computes a well-separated 
decomposition of the particles in O(n log n) time followed by computing the 
interactions in O(n) time. In what follows, we present an O(n) algorithm for the 
N-body problem given the DIAT tree. Though our run-time is the same, there 
are some important differences between the two algorithms: Almost all the 
N-body algorithms used by practitioners involve hyperoctrees. DIAT trees contain 
the same type of cells. The regularity in the shapes and locations of the cells makes 
it easy to perform error calculations. 

The starting point of our algorithm is Greengard's fast multipole method [12]. 
Greengard considers cells of the same length. Two such cells are called 
well-separated if the multipole expansion at one cell converges at any point in the 
other, i.e., they are not adjacent. For the DIAT tree, a capability to deal with 
different cell lengths is needed. We generalize the well-separatedness criterion to any 
two cells C and D which are not necessarily of the same length. Define a predicate 
well-separated(C, D) to be true if D's multipole expansion converges at any 
point in C, and false otherwise. If two cells are not well-separated, they are 
proximate. Similarly, two nodes $v_1$ and $v_2$ in the DIAT tree are said to be 
well-separated if and only if $S(v_1)$ and $S(v_2)$ are well-separated. Otherwise, we say 
that $v_1$ and $v_2$ are proximate. 

Our DIAT tree based N-body algorithm is as follows: For each node v in 
the DIAT tree, we wish to compute the multipole expansion $\phi(v)$ and the local 
expansion $\psi(v)$. Both $\phi(v)$ and $\psi(v)$ are with respect to the cell S(v). The 
multipole expansions can be computed by a simple bottom-up traversal in O(n) 
time. At a node v, $\phi(v)$ is computed by aggregating the multipole expansions of 
the children of v. 
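
The shape of this bottom-up pass can be sketched as follows. As a deliberate simplification, the sketch replaces the multipole expansion by its lowest-order analogue, a total mass and a center of mass; the `Cell` class and its fields are hypothetical and are not part of the DIAT tree definition.

```python
class Cell:
    """Hypothetical tree node; leaves carry (mass, x, y) particles."""

    def __init__(self, children=(), particles=()):
        self.children, self.particles = list(children), list(particles)
        self.mass, self.center = 0.0, (0.0, 0.0)


def compute_multipoles(v):
    """Post-order traversal: summarise the children first, then aggregate at v."""
    for c in v.children:
        compute_multipoles(c)
    # Each child contributes its summary; a leaf contributes its own particles.
    parts = [(c.mass, *c.center) for c in v.children] + v.particles
    v.mass = sum(m for m, _, _ in parts)
    if v.mass > 0:
        v.center = (sum(m * x for m, x, _ in parts) / v.mass,
                    sum(m * y for m, _, y in parts) / v.mass)
```

Each node is visited once and does work proportional to its number of children, so the pass is linear in the size of the tree.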

The algorithm to compute the local expansions is given in Figure 2. The 
computations are done using a top-down traversal of the tree. To compute the local 
expansion at node v, we have to consider the set of nodes that are proximate 
to its parent, which is the proximity set, P(parent(v)). The proximity set of the 
root node contains only itself. We recursively decompose these nodes until each 
node is either 1) well-separated from v or 2) proximate to v and the length of 
the small cell of the node is smaller than the small cell of v. The nodes satisfying 
the first condition form the interaction set of v, I(v), and the nodes satisfying 
the second condition are in the proximity set of v, P(v). In the algorithm, the 
set E(v) contains the nodes that are yet to be processed. As in Greengard's 
algorithm, local expansions are computed by combining the parent's local expansion 
and the multipole expansions of the nodes in I(v). For the leaf nodes, potential 
calculation is completed by using the direct method. 

Algorithm 2 Compute-Local-Exp(v) 

I. Find the proximity set P(v) and interaction set I(v) for v 
    $E(v) = P(\mathrm{parent}(v))$ 
    $I(v) = \emptyset$; $P(v) = \emptyset$ 
    While $E(v) \neq \emptyset$ do 
        Pick some $u \in E(v)$ 
        $E(v) = E(v) - \{u\}$ 
        If well-separated(S(v), S(u)) 
            $I(v) = I(v) \cup \{u\}$ 
        Else if S(u) is smaller than S(v) 
            $P(v) = P(v) \cup \{u\}$ 
        Else $E(v) = E(v) \cup \mathrm{children}(u)$ 

II. Calculate the local expansion at v 
    Assign shifted $\psi(\mathrm{parent}(v))$ to $\psi(v)$ 
    For each node $u \in I(v)$ 
        Add shifted $\phi(u)$ to $\psi(v)$ 

III. Calculate the local expansions at the children of v with recursive calls 
    For each child u of v 
        Compute-Local-Exp(u) 

Fig. 2. Algorithm for calculating local expansions of all nodes in the tree rooted 
at v. 
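
Step I of the algorithm in Figure 2 can be exercised on a toy tree. The sketch below makes several assumptions that are not taken from the paper: cells are one-dimensional intervals in an uncompressed binary tree, `well_separated` is a hypothetical geometric criterion (gap at least the larger side length), and the set handed to the root is taken to be the root itself. Step II is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    lo: float                      # left end of the 1-D cell S(v)
    length: float                  # side length of S(v)
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)
    P: list = field(default_factory=list)   # proximity set P(v)
    I: list = field(default_factory=list)   # interaction set I(v)


def gap(a, b):
    """Distance between two intervals (0 if they touch or overlap)."""
    return max(0.0, max(a.lo, b.lo) - min(a.lo + a.length, b.lo + b.length))


def well_separated(a, b):
    # Hypothetical criterion: the gap is at least the larger side length.
    return gap(a, b) >= max(a.length, b.length)


def compute_sets(v):
    """Step I of Figure 2 at v, followed by the recursive calls of Step III."""
    E = list(v.parent.P) if v.parent else [v]   # root starts from {root}
    v.P, v.I = [], []
    while E:
        u = E.pop()
        if well_separated(v, u):
            v.I.append(u)
        elif u.length < v.length:
            v.P.append(u)
        else:
            E.extend(u.children)    # a proximate leaf simply drops out here
    for c in v.children:
        compute_sets(c)


def build(lo, length, depth, parent=None):
    """Uniform binary tree over [lo, lo + length) with the given depth."""
    node = Node(lo, length, parent)
    if depth > 0:
        half = length / 2
        node.children = [build(lo, half, depth - 1, node),
                         build(lo + half, half, depth - 1, node)]
    return node
```

On a depth-2 tree over [0, 4), the leftmost leaf ends up with the two leaves of the other half in its interaction set, while its immediate neighbour is left to the direct method.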

Before analyzing the run-time of this algorithm, we need to precisely 
define the sets used in the algorithm. The set of cells that are proximate to the 
cell C and have the same length as C is called the proximity set of C and is 
defined by 

\[ P^{=}(C) = \{\, D \mid \mathrm{length}(C) = \mathrm{length}(D),\ \neg\,\text{well-separated}(C, D) \,\}. \] 

The superscript "=" is used to indicate that cells of the same length are being 
considered. For a node v in the DIAT tree, define the proximity set P(v) as the 
set of all nodes proximate to v and having the small cell smaller and large 
cell larger than S(v). More precisely, 

\[ P(v) = \{\, w \mid \neg\,\text{well-separated}(S(v), S(w)),\ \mathrm{length}(S(w)) < \mathrm{length}(S(v)) < \mathrm{length}(L(w)) \,\}. \] 

The interaction set I(v) of v is defined as 

\[ I(v) = \{\, w \mid \text{well-separated}(S(v), S(w)),\ [\, w \in P(\mathrm{parent}(v)) \ \vee\ \{ \exists u \in P(\mathrm{parent}(v)),\ w \text{ is a descendant of } u,\ \neg\,\text{well-sep}(v, \mathrm{parent}(w)),\ \mathrm{length}(S(v)) < \mathrm{length}(S(\mathrm{parent}(w))) \} \,] \,\}. \] 

We use parent(w) to denote the parent of the node w. 









In the rest of the section, we prove that our algorithm achieves a running 
time of O(n) for any predicate well-separated that satisfies the following three 
conditions: 

C1. The relation well-separated is symmetric for equal length cells, that is, 
$\mathrm{length}(C) = \mathrm{length}(D) \Rightarrow \text{well-separated}(C, D) = \text{well-separated}(D, C)$. 

C2. For any cell C, $|P^{=}(C)|$ is bounded by a constant. 

C3. If two cells C and D are not well-separated, any two cells C' and D' such 
that $C \subseteq C'$ and $D \subseteq D'$ are not well-separated as well. 

These three conditions are respected by the various well-separatedness criteria 
used in N-body algorithms and in particular, Greengard's algorithm. In N-body 
methods, the well-separatedness decision is solely based on the geometry of the 
cells and their relative distance and is oblivious to the number of particles or 
their distribution within the cells. Given two cells C and D of the same length, 
if D can be approximated with respect to C, then C can be approximated with 
respect to D as well, as stipulated by Condition C1. The size of the proximity 
sets of cells of the same length should be O(1), as prescribed by Condition C2, in 
order that an O(n) algorithm is possible. Otherwise, we can construct an input 
that requires processing the proximity sets of O(n) such cells, making an O(n) 
algorithm impossible. Condition C3 merely states that two cells C' and D' are 
not well-separated unless every subcell of C' is well-separated from every subcell 
of D'. 
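
To make the three conditions concrete, the sketch below checks them on one simple geometric criterion. The criterion itself is an assumption chosen for illustration (axis-parallel cubes are well-separated when the Chebyshev gap between them is at least the larger side length); it is not the convergence-based predicate of the paper, and the helper names are hypothetical.

```python
from itertools import product


def gap(c, d):
    """Chebyshev distance between two axis-parallel cubes (0 if they touch)."""
    (co, cl), (do, dl) = c, d
    return max(0.0, max(max(a, b) - min(a + cl, b + dl) for a, b in zip(co, do)))


def well_separated(c, d):
    # Hypothetical criterion: the gap is at least the larger side length.
    return gap(c, d) >= max(c[1], d[1])


def same_length_proximity(c, span=3):
    """Equal-length grid cells near c that are not well-separated from c."""
    origin, length = c
    cells = [(tuple(o + k * length for o, k in zip(origin, ks)), length)
             for ks in product(range(-span, span + 1), repeat=len(origin))]
    return [d for d in cells if not well_separated(c, d)]
```

Here a cell is a pair `(origin, length)`. Under this criterion the predicate is symmetric by construction (C1), a cell has 3^d equal-length proximate cells including itself (C2), and enlarging two proximate cells can only shrink the gap and grow the threshold, so they stay proximate (C3).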

Lemma 3. For any node v in the DIAT tree, |P(v)| = O(1). 

Proof. Consider any node v. Each $u \in P(v)$ can be associated with a unique cell 
$C \in P^{=}(S(v))$ such that $S(u) \subseteq C$. This is because any subcell of C which is 
not a subcell of S(u) is not represented in the DIAT tree. It follows that 
$|P(v)| \le |P^{=}(S(v))| = O(1)$ (by Condition C2). 



Lemma 4. The sum of interaction set sizes over all nodes in the DIAT tree is 
linear in the number of nodes in the DIAT tree, i.e., $\sum_v |I(v)| = O(n)$. 

Proof. Let v be a node in the DIAT tree. Consider any $w \in I(v)$; either 
$w \in P(\mathrm{parent}(v))$ or w is in the subtree rooted at a node $u \in P(\mathrm{parent}(v))$. Thus, 

\[ \sum_v |I(v)| = \sum_v |\{\, w \mid w \in I(v),\ w \in P(\mathrm{parent}(v)) \,\}| + \sum_v |\{\, w \mid w \in I(v),\ w \notin P(\mathrm{parent}(v)) \,\}|. \] 

Consider these two summations separately. The bound for the first summation 
is easy: from Lemma 3, |P(parent(v))| = O(1). So, 

\[ \sum_v |\{\, w \mid w \in I(v),\ w \in P(\mathrm{parent}(v)) \,\}| = \sum_v O(1) = O(n). \] 






The second summation should be explored more carefully. 

\[ \sum_v |\{\, w \mid w \in I(v),\ w \notin P(\mathrm{parent}(v)) \,\}| = \sum_w |\{\, v \mid w \in I(v),\ w \notin P(\mathrm{parent}(v)) \,\}| \] 

In what follows, we bound the size of the set 
$M(w) = \{\, v \mid w \in I(v),\ w \notin P(\mathrm{parent}(v)) \,\}$ for any node w. 

Since $w \notin P(\mathrm{parent}(v))$, there exists a node $u \in P(\mathrm{parent}(v))$ such that w 
is in the subtree rooted at u. Consider parent(w): the node parent(w) is 
either u or a node in the subtree rooted at u. In either case, 
$\mathrm{length}(S(\mathrm{parent}(w))) < \mathrm{length}(S(\mathrm{parent}(v)))$. Thus, for each $v \in M(w)$, there exists a cell C such 
that $S(v) \subseteq C \subseteq S(\mathrm{parent}(v))$ and $\mathrm{length}(S(\mathrm{parent}(w))) = \mathrm{length}(C)$. 
Further, since v and parent(w) are not well-separated (by the definition of I(v)), 
C and S(parent(w)) are not well-separated as well by Condition C3. That is to say, 
$S(\mathrm{parent}(w)) \in P^{=}(C)$ and $C \in P^{=}(S(\mathrm{parent}(w)))$ by Condition C1. By Condition C2, we know that 
$|P^{=}(S(\mathrm{parent}(w)))| = O(1)$. Moreover, for each cell $C \in P^{=}(S(\mathrm{parent}(w)))$, 
there are at most $2^d$ choices of v because $\mathrm{length}(C) < \mathrm{length}(S(\mathrm{parent}(v)))$. 
As a result, $|M(w)| \le 2^d \times O(1) = O(1)$ for any node w. Thus, 

\[ \sum_w |M(w)| = \sum_w |\{\, v \mid w \in I(v),\ w \notin P(\mathrm{parent}(v)) \,\}| = \sum_w O(1) = O(n). \] 



Theorem 3. Given a DIAT tree for n particles, the N-body problem can be 
solved in O(n) time. 

Proof. Computing the multipole expansion at a node takes constant time and 
the number of nodes in the DIAT tree is O(n). Thus, the total time required for the 
multipole expansion calculation is O(n). The nodes explored during the local 
expansion calculation at a node v are either in P(v) or I(v). In both cases, 
it takes constant time to process a node. By Lemmas 3 and 4, the total size 
of both sets for all nodes in the DIAT tree is bounded by O(n). Thus, the local 
expansion calculation takes O(n) time. In conclusion, the fast multipole 
algorithm on the DIAT tree takes O(n) time irrespective of the 
distribution of the particles. 

4 Conclusions 

In this paper we presented the DIAT tree, a new dynamic data structure for 
multidimensional point data. We presented construction algorithms for DIAT 
trees in O(n log n) time and search and insertion/deletion algorithms in O(log n) 
time. 

DIAT trees can potentially be used to solve any application that currently 
uses hyperoctrees. We have presented an optimal algorithm for the N-body 
problem using DIAT trees. The DIAT trees and algorithms presented improve or 
match the results presented in [4], [10] and [15] and provide dynamic handling 
of the data points. For example, the randomized algorithm for compressed 
hyperoctree construction in [10] can be replaced by the DIAT tree construction algorithm 
presented in this paper to obtain a deterministic O(n log n) algorithm for the 
all-nearest-neighbors problem. Many other applications use hyperoctrees and we believe 
that faster sequential and parallel algorithms can be designed for them using 
DIAT trees. We recently developed parallel algorithms for constructing DIAT 
trees and for solving the N-body problem using them. Both algorithms are 
independent of the distribution of points and have rigorously analyzable 
worst-case running times. 



Acknowledgments 

The authors gratefully acknowledge valuable suggestions made by the reviewers 
which led to improvements in the presentation of the paper. 



References 

1. Aluru, S.: Greengard's N-body algorithm is not order N. SIAM Journal on Scientific 
Computing 17 (1996) 773-776. 

2. Arya, S., Mount, D., Netanyahu, N., Silverman, R., Wu, A.Y.: An optimal 
algorithm for approximate nearest neighbor searching. Proc. ACM-SIAM Symposium 
on Discrete Algorithms (1994) 573-582. 

3. Arya, S., Mount, D., Netanyahu, N., Silverman, R., Wu, A.Y.: An optimal 
algorithm for approximate nearest neighbor searching in fixed dimensions. Journal of 
the ACM 45 (1998) 891-923. 

4. Bern, M., Eppstein, D., Teng, S.H.: Parallel construction of quadtrees and quality 
triangulations. Proc. Workshop on Algorithms and Data Structures (1993) 188-199. 

5. Bern, M.: Approximate closest-point queries in high dimensions. Information 
Processing Letters 45 (1993) 95-99. 

6. Bespamyatnikh, S.N.: An optimal algorithm for closest-pair maintenance. Discrete 
Comput. Geom. 19 (1998) 175-195. 

7. Callahan, P.B., Kosaraju, S.R.: A decomposition of multidimensional point sets 
with applications to k-nearest neighbors and N-body potential fields. Journal of 
the ACM 42 (1995) 67-90. 

8. Callahan, P.B., Kosaraju, S.R.: Algorithms for dynamic closest pair and n-body 
potential fields. Proc. ACM-SIAM Symposium on Discrete Algorithms (1995) 263-272. 

9. Chazelle, B.: A theorem on polygon cutting with applications. Proc. Foundations 
of Computer Science (1982) 339-349. 

10. Clarkson, K.L.: Fast algorithms for the All-Nearest-Neighbors problem. Proc. 
Foundations of Computer Science (1983) 226-232. 

11. Cohen, R.F., Tamassia, R.: Combine and conquer. Algorithmica 18 (1997) 51-73. 

12. Greengard, L., Rokhlin, V.: A fast algorithm for particle simulations. Journal of 
Computational Physics 73 (1987) 325-348. 

13. Frederickson, G.N.: A data structure for dynamically maintaining rooted trees. 
Proc. ACM-SIAM Symposium on Discrete Algorithms (1993) 175-194. 

14. Mitchell, J.S.B., Mount, D.M., Suri, S.: Query-sensitive ray shooting. International 
Journal of Computational Geometry and Applications 7 (1997) 317-347. 

15. Schwarz, C., Smid, M., Snoeyink, J.: An optimal algorithm for the on-line 
closest-pair problem. Algorithmica 12 (1994) 18-29. 

16. Sleator, D.D., Tarjan, R.E.: A data structure for dynamic trees. Journal of 
Computer and System Sciences 26 (1983) 362-391. 

17. Teng, S.H.: Provably good partitioning and load balancing algorithms for parallel 
adaptive N-body simulations. SIAM Journal on Scientific Computing 19 (1998) 
635-656. 

18. Vaidya, P.M.: An O(n log n) algorithm for the All-Nearest-Neighbors problem. 
Discrete Computational Geometry 4 (1989) 101-115. 



Largest Empty Rectangle among a Point Set 



Jeet Chaudhuri^1 and Subhas C. Nandy^2* 

^1 Wipro Limited, Technology Solns, Bangalore 560034, India 
^2 Indian Statistical Institute, Calcutta - 700 035, India 



Abstract. This paper generalizes the classical MER problem in 2D. 
Given a set P of n points, here a maximal empty rectangle (MER) is 
defined as a rectangle of arbitrary orientation such that each of its four 
sides coincides with at least one member of P and the interior of the 
rectangle is empty. We propose a simple algorithm based on standard 
data structures to locate the largest-area MER on the floor. The time and 
space complexities of our algorithm are $O(n^3)$ and $O(n^2)$ respectively. 



1 Introduction 

Recognition of all maximal empty axes-parallel (isothetic) rectangles, commonly 
known as the MER problem, was first introduced in [3]. Here a set of points, say 
$P = \{p_1, p_2, \ldots, p_n\}$, is distributed on a rectangular floor. The objective is to 
locate all possible isothetic maximal empty rectangles (MERs). The authors of [3] proposed an 
algorithm for this problem with time complexity $O(\min(n^2, R \log n))$, where R, 
the number of reported MERs, may be $O(n^2)$ in the worst case. The 
time complexity was later improved to O(R + n log n) [6]. The best result for 
locating the largest empty isothetic rectangle among a point set without 
inspecting all MERs appeared in [1]; it uses divide and conquer and matrix searching 
techniques and its time complexity is $O(n \log^2 n)$. The MER problem was later 
generalized to a set of isothetic obstacles [4], and to a set of non-isothetic 
obstacles [5]. 

In this context, it should be mentioned that for a given set of points P, the 
location of all/largest empty r-gons whose vertices coincide with the members 
of P can be reported in $O(\gamma_3(P) + r\,\gamma_r(P))$ time if $r \ge 5$; for r = 3 and 4, it 
requires $O(\gamma_r(P))$ time [2], where $\gamma_r(P)$ is the number of such empty r-gons. It 
is also shown that $\gamma_4(P) \ge \gamma_3(P) - \binom{n-1}{2}$, and the expected value of $\gamma_3(P)$ is 
$O(n^2)$. This provides a lower bound on the number of empty convex quadrilaterals 
in terms of the number of empty triangles. 

This paper outlines a natural generalization of the classical MER problem. 
Given n points on a 2D plane, a long standing open problem is to locate an 
empty rectangle of maximum area. Thus the earlier restriction of the isotheticity 
of the MER is relaxed. This type of problem often arises in different industrial 
applications where one needs to cut a largest defect-free rectangular piece from a 
given metal sheet. We adopt a new algorithmic paradigm, called grid rotation, to 
solve this problem. The time and space complexities of our algorithm are $O(n^3)$ 
and $O(n^2)$ respectively. 

* This work was done when the author was visiting School of Information Science, 
Japan Advanced Institute of Science and Technology, Ishikawa 923-1292, Japan. 

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 34-46, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 

2 Basic Concepts 



Let $P = \{p_1, p_2, \ldots, p_n\}$ be a set of n arbitrarily distributed points on a 2D 
region. Without loss of generality, we may choose the origin of the coordinate 
system such that all the points lie in the first quadrant. The coordinates of point $p_i$ 
will be denoted by $(x_i, y_i)$. We shall define a maximal empty rectangle (MER) 
by an ordered tuple as follows: 

Definition: A rectangle (of arbitrary orientation) on the floor is called an 
MER if it is empty, i.e., not containing any member of P, and no other empty 
rectangle can properly enclose it. Thus, each of the four boundaries of an MER 
must contain at least one point of P. 

The notion of a rectangle assumes that this geometric structure is bounded 
on all sides. But we take the liberty to loosen this notion to include also those 
rectangles as MERs that are unbounded on some side, provided, of course, they are 
empty and cannot be properly contained in some other empty rectangle. The rectangles 
and MERs in the conventional sense shall then be referred to specifically as 
"bounded". 




Fig. 1. Definition of an MER 



It should be mentioned here that, by slightly rotating an empty rectangle with 
four specified points $\{p_i, p_j, p_k, p_\ell\}$ on its four boundaries, one may get multiple 
MERs with the same set of bounding points. (The pictorial representation of 
this situation is dropped due to the space limitation.) This motivates us 
to define a PMER as follows: 

Definition: Consider a set of MERs with $\{p_i, p_j, p_k, p_\ell\}$ on their four 
boundaries, whose south boundary makes an angle between $\phi$ and $\psi$ with the positive 
direction of the x-axis. An MER in this set is said to be a prime MER (PMER) if 
there exists no MER in this set which is of larger size than it. A set of MERs, 
corresponding to a PMER, is represented by a six-tuple $\{p_i, p_j, p_k, p_\ell, \phi, \psi\}$. 

In order to assign some order among the points on the four sides of an MER, 
consider the corner of the MER having maximum y-coordinate. The side 
(boundary) of the MER incident to that corner and having non-negative slope will be 
referred to as its north boundary. The other boundaries, viz. the east, south and 
west boundaries, are defined based on the north boundary (see Figure 1). Strictly 
speaking, this nomenclature is a misnomer in the context of non-axis-parallel 
rectangles, but it helps to explain our method. Our objective is to locate the 
largest MER each of whose four sides is bounded by some point(s) of P. 

The algorithmic technique discussed in this paper shows that the number of 
PMERs may be $O(n^3)$ in the worst case. Let us mention, once and for all, that 
henceforth we shall use the terms 'MER' and 'rectangle' interchangeably. 
Further, we may refer to north, west, south and east as top, left, bottom and right 
respectively. 



3 Identification of PMERs 

In this section, we explain the recognition of all PMERs using a very simple 
algorithmic technique, called grid rotation. Initially, we draw n horizontal lines 
and n vertical lines through all the members of P. The resulting diagram is a 
grid, but the separation between consecutive vertical (horizontal) lines is not 
uniform. For a given P, the initial grid diagram is shown in Figure 2a. During 
execution of the algorithm these lines will be rotated and will no longer remain 
horizontal/vertical. So, we shall refer to the lines which are initially horizontal 
(vertical) as red lines (blue lines). At any instant of time during the execution 
of the algorithm, the angle $\theta$ made by each of the red lines with the x-axis will be 
referred to as the grid angle. 

Consider the set of MERs which are embedded in the grid, i.e., the set of 
MERs whose sides are incident to the grid lines. We maintain these rectangles 
in a data structure, called grid diagram, as follows. 

3.1 Grid Diagram 

First we aim to represent the grid in Figure 2a using an $(n+1) \times (n+1)$ matrix $\mathcal{M}$, 
where n = |P|. The row numbers 0 to n of this matrix increase from bottom to 
top while the column numbers 0 to n increase from left to right. Let $P_y = \{p'_1, p'_2, \ldots, p'_n\}$ 
be the ordered set of points in P in increasing values of their y-coordinates, 
and $P_x = \{p''_1, p''_2, \ldots, p''_n\}$ be the ordered set of points in P in increasing values 
of their x-coordinates. Let $p \in P$ be such that $p = p'_k$ and $p = p''_j$. Then we put 
$\mathcal{M}(k, j) = 1$. The other entries in $\mathcal{M}$ are initialized to 0. Thus each red (blue) 
line of the grid is mapped to a row (column) of the matrix $\mathcal{M}$, and each point 
in P is mapped to a 1 entry in $\mathcal{M}$. 
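
The initial matrix can be built with two sorts. The following is a minimal sketch, assuming points are given as (x, y) pairs with pairwise distinct x- and y-coordinates; the function name `build_grid_matrix` and the returned `row`/`col` maps are illustrative, not taken from the paper.

```python
def build_grid_matrix(points):
    """Return the (n+1) x (n+1) 0/1 matrix plus each point's row and column."""
    n = len(points)
    by_y = sorted(range(n), key=lambda i: points[i][1])   # order P_y
    by_x = sorted(range(n), key=lambda i: points[i][0])   # order P_x
    row = {i: k + 1 for k, i in enumerate(by_y)}          # rows 1..n, bottom to top
    col = {i: j + 1 for j, i in enumerate(by_x)}          # columns 1..n, left to right
    M = [[0] * (n + 1) for _ in range(n + 1)]             # row 0 and column 0 stay empty
    for i in range(n):
        M[row[i]][col[i]] = 1
    return M, row, col
```

Row 0 and column 0 hold no points, which leaves room for the entries used later for rectangles that are unbounded on some side.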

Our next step will be to represent the embedded rectangles. First we consider 
only those that are bounded on all sides by some point in P. Let us consider a 
rectangle in the grid as in Figure 2a bounded by $\{p_n, p_w, p_s, p_e\}$ $(= \{b, a, g, d\})$ 
at its north, west, south and east sides respectively. Let $p_n = p'_{\alpha_n}$ and $p_n = p''_{\beta_n}$, 
or in other words, $p_n$ corresponds to the $\alpha_n$-th row and the $\beta_n$-th column of the 
matrix $\mathcal{M}$. Similarly, the row (column) indices corresponding to $p_w$, $p_s$ and $p_e$ 
are $\alpha_w$, $\alpha_s$ and $\alpha_e$ ($\beta_w$, $\beta_s$ and $\beta_e$) respectively. We represent this MER by (i) 
storing the point pair $(p_w, p_e)$ in the $(\alpha_s, \beta_n)$-th entry of the matrix $\mathcal{M}$, which 
we shall henceforth term as storing at the south boundary, and also (ii) storing 
the point pair $(p_n, p_s)$ in the $(\alpha_e, \beta_w)$-th entry of the same matrix, which we 
shall describe as storing at the west boundary. 





Fig. 2. Demonstration of grid rotation technique using grid diagram 



Next we aim to store rectangles that are unbounded on one side. An MER that 
is unbounded in the north will have $p_n$ being $-\infty$ while storing it at the west, 
and is stored at the south in $\mathcal{M}(\alpha_s, \beta_s)$, i.e., the entry of $\mathcal{M}$ representing the 
point which bounds it at the south. An MER unbounded in the south, likewise, 
will have $p_s$ being $\infty$ while storing it at the west, and is assumed to be bounded 
by the 0-th row in the matrix in the south. Thus, we store it at the south at the 
entry in the 0-th row and the column corresponding to the north bounding point. 
The case is absolutely symmetrical for MERs unbounded at either west or east 
but not both. 

Next we tackle the case of MERs that are unbounded on exactly two 
sides. If a rectangle is unbounded in the north and south, we store the 
rectangle at the west boundary only, with $p_n$ and $p_s$ being $-\infty$ and $\infty$ 
respectively. Similarly, a rectangle unbounded in the east and west is stored at the 
south boundary only, with $p_w$ and $p_e$ being $-\infty$ and $\infty$ respectively. Also, an 
MER that is unbounded on two adjacent sides, for example in the north and west, is 
stored both in the west and south, in the manner outlined in the last paragraph. 

A rectangle that is bounded on only one side can also be represented. 
An MER that is bounded only in the south is represented at the south at the 
point which bounds it there. The west and east bounding vertices are represented 
by $-\infty$ and $\infty$ respectively. Again, an MER that is bounded only in the north 
is represented at the south at the corresponding entry in the 0-th row of the 
matrix. The west and east boundaries are represented as in the last case. We, 
however, do not store any of these MERs at the west. In an exactly symmetric 
manner, we can take care of rectangles that are bounded only in the west or in 
the east. 

Observation 1. i) Given a fixed grid angle, and a pair of points $p_n$ and $p_s$, if 
there exists an MER whose opposite sides contain $p_n$ and $p_s$ respectively, then 
its bounding blue lines are unique. 

ii) Given a fixed grid angle, and a pair of points $p_w$ and $p_e$, if there exists an 
MER whose opposite sides contain $p_w$ and $p_e$ respectively, then its bounding red 
lines are unique. □ 

The matrix $\mathcal{M}$ is initialized with the set of all axis-parallel MERs, which are 
obtained by invoking the algorithm presented in [6]; this requires O(R + n log n) 
time, where R is the total number of MERs whose sides are parallel to the 
coordinate axes. 

3.2 Data Structure 

We consider the set of points P as an array, each element of which contains its 
corresponding row and column numbers in the matrix $\mathcal{M}$. Also we maintain an 
array P' whose elements correspond to the rows in $\mathcal{M}$ in bottom to top order. 
An entry of P' is the index of the point in P corresponding to that row. In an 
exactly similar fashion, we have another array P'', whose elements correspond to 
the columns ordered from left to right, and each of its entries stores the index of the 
point in P corresponding to that column. Each element of $\mathcal{M}$ now consists of 
the following fields: 

(i) a value field containing 1, 2 or 4 depending on whether the corresponding 
entry represents a point in P, or stores an MER at its south boundary, or 
stores an MER at its west boundary. The value field may also contain a 
sum of any of these three primary values, to denote that the corresponding 
cases occur simultaneously. If none of these cases occurs, the value field 
contains 0. Now, it may be observed that any point in the set P shall 
always bound an MER in the south that is unbounded in the north, and 
bound an MER in the west that is unbounded in the east. In view of this, 
it is evident that once this field includes 1, it should also include 2 and 4. 
Thus, a matrix entry representing a point in P should contain 
7 (1+2+4). A matrix entry representing only an MER stored at its south 
(west) boundary will contain 2 (4). A matrix entry representing two MERs, 
one of which is stored at its south boundary and the other at 
its west boundary, will contain the value 6. By Observation 1, there exists no 
matrix entry representing more than one MER stored at its south 
boundary, or more than one stored at its west boundary. Thus the possible nonzero values of 
the matrix entries are 2, 4, 6 or 7. 

(ii) two pointers P1 and P2 storing the (indices in P of) two different points 
which appear on the west and east boundaries of the MER represented 
by that element at the south. Thus, if the value field is 4 or 0, P1 and P2 
contain NULL. Again, if the MER represented by an element is unbounded 
on a particular side, then the corresponding pointer is also set to NULL. 

(iii) two more pointers P3 and P4 storing the (indices in P of) two different 
points which appear on the north and south boundaries of the MER 
represented by that element at the west. Thus, if the value field is 2 or 0, P3 
and P4 contain NULL. Again, if the MER represented by an element is 
unbounded on a particular side, then the corresponding pointer is also set 
to NULL. 

(iv) a pair of initial grid angles, one for the MER represented at the south by 
this entry and one for the MER represented at the west, if any/both 
is/are represented. 
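
The fields (i) to (iv) map naturally onto a small record type. The sketch below is one possible layout, not the authors' implementation: the class name `Entry`, the flag constants, the field names, and the two helper functions are all hypothetical, and `None` plays the role of NULL.

```python
from dataclasses import dataclass
from typing import Optional

POINT, SOUTH, WEST = 1, 2, 4   # primary values of the value field


@dataclass
class Entry:
    value: int = 0
    p1: Optional[int] = None   # west bounding point of the MER stored at the south
    p2: Optional[int] = None   # east bounding point of the MER stored at the south
    p3: Optional[int] = None   # north bounding point of the MER stored at the west
    p4: Optional[int] = None   # south bounding point of the MER stored at the west
    south_angle: Optional[float] = None   # initial grid angle of the south MER
    west_angle: Optional[float] = None    # initial grid angle of the west MER


def store_at_south(entry, west, east, angle):
    """Record an MER at its south boundary; None marks an unbounded side."""
    entry.value |= SOUTH
    entry.p1, entry.p2, entry.south_angle = west, east, angle


def store_at_west(entry, north, south, angle):
    """Record an MER at its west boundary; None marks an unbounded side."""
    entry.value |= WEST
    entry.p3, entry.p4, entry.west_angle = north, south, angle
```

Using bit flags keeps the sums described in (i) implicit: an entry holding both kinds of MER has value 6, and an entry for a point of P has value 7.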

3.3 Grid Rotation 

Now we describe how the grid diagram changes due to the rotation of the grid. 
The rotation of the grid implies that for each point, the pair of mutually 
perpendicular lines passing through that point are rotated gradually in an 
anti-clockwise direction, and all at the same rate. Let us imagine the rectangles 
embedded in the grid to be rotating also with the rotation of the grid as shown 
in Figure 2b. Now, for a very small rotation, although the rectangles change in 
size, their bounding points nevertheless remain the same. However, when a pair of 
grid lines actually coincide, some rectangles might degenerate, some 
rectangles might be formed anew, while some may have their bounding vertices 
changed. At this stage we need to do the following: 

Update the data structure to account for the new set of MERs. For each such 
MER, we need to store the current grid angle at the appropriate place. 

The MERs that remain with the current set of bounding vertices after the 
rotation, do not need any computation. 

For the MERs (defined by a specified set of tuples) which were present in 
the data structure, but will not remain alive from the current instant of 
time onwards, we may need to compute the PMER, and update the data 
structures. 

In other words, in a particular orientation of the grid angle, say at $\theta = \phi$, if a set 
of four points $\{p_n, p_w, p_s, p_e\}$ defines an MER for the first time, and if it remains 
valid for an interval $\phi \le \theta \le \psi$ of the grid angle, then the entry in $\mathcal{M}$ representing 
that MER is created when $\theta = \phi$, and the PMER corresponding to the 
six-tuple $\{p_n, p_w, p_s, p_e, \phi, \psi\}$ is computed when $\theta$ becomes equal to $\psi$ during the 
grid rotation. Recall that we store the initial grid angle $\phi$ for this set of MERs; 
it is done precisely for this purpose. 
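
One way to compute the PMER of a six-tuple is sketched below. It is an illustration under stated assumptions rather than the procedure of the paper: the four bounding points are assumed to define a valid rectangle throughout the interval, and the function names are hypothetical. Writing A and B for the directions of the vectors from the west to the east point and from the south to the north point, with lengths a and b, the area at grid angle θ is a·b·cos(θ − A)·sin(B − θ) = (a·b/2)[sin(B − A) + sin(A + B − 2θ)], which peaks at θ = (A + B)/2 − π/4 modulo π. The maximum over the interval is therefore attained at an endpoint or at that critical angle.

```python
import math


def mer_area(pn, pw, ps, pe, theta):
    """Area of the rectangle through the four points at grid angle theta."""
    ux, uy = math.cos(theta), math.sin(theta)     # direction of the red lines
    width = (pe[0] - pw[0]) * ux + (pe[1] - pw[1]) * uy
    height = -(pn[0] - ps[0]) * uy + (pn[1] - ps[1]) * ux
    return width * height


def prime_mer(pn, pw, ps, pe, phi, psi):
    """Return (angle, area) of the largest rectangle for theta in [phi, psi]."""
    a_dir = math.atan2(pe[1] - pw[1], pe[0] - pw[0])
    b_dir = math.atan2(pn[1] - ps[1], pn[0] - ps[0])
    star = (a_dir + b_dir) / 2 - math.pi / 4      # area peaks at star + k*pi
    star += math.pi * math.ceil((phi - star) / math.pi)   # first peak >= phi
    candidates = [phi, psi] + ([star] if star <= psi else [])
    best = max(candidates, key=lambda t: mer_area(pn, pw, ps, pe, t))
    return best, mer_area(pn, pw, ps, pe, best)
```

For example, with west and east points two units apart horizontally and north and south points one unit apart vertically, the area is 2cos²θ, so the best angle in an interval containing 0 is 0 itself.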

Note that, if we gradually rotate the grid by an angle $\pi/2$, we can enumerate all 
the PMERs that exist on the floor. Our aim is to find the one having maximum 
area. 

Selection of event points 

In order to perform the rotational sweep of the grid, we need to know the 
order in which pairs of grid lines of the same color swap. This requires sorting 
the absolute gradients of the lines obtained by joining each pair of points. During 
the rotation of the grid, we need to stop $O(n^2)$ times, when either the red lines 
or the blue lines become parallel to any of those lines. We consider two different 
sets containing all the lines having positive and negative slopes respectively. The 
lines in the first (second) set are sorted in increasing order of the angle $\theta$ with 
the x-axis (y-axis) in counter-clockwise direction. It is easy to see that in each 
set the lines are stored in increasing order of their gradients. Finally, these two 
sets are merged with respect to the angle $\theta$ considered for sorting. Needless to 
say, this requires $O(n^2)$ space for storing the angles of all the lines, and the 
sorting requires $O(n^2 \log n)$ time in the worst case. But note that we do not need 
to store the gradients of all the lines permanently; rather we are satisfied if we 
get the event points (the angles) in proper order during grid rotation. In the 
full paper, we will show, using geometric duality, that the event points can be 
generated using O(n) space. 

Some important properties of grid rotation 

Next, we come to the most crucial part of generating the new set of MERs 
and consequently updating $\mathcal{M}$ when a pair of red (blue) lines in the grid swap. 
This actually causes the swap of a pair of rows (columns) of the matrix $\mathcal{M}$. 

Here we need to mention that while processing an event point, if the angle of 
the joining line of the corresponding pair of points with the x-axis is $< \pi/2$ ($> \pi/2$), 
it results in a swap of rows (columns) corresponding to those points. 
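
The straightforward quadratic-space way of producing the event points can be sketched as follows. The sketch assumes points in general position (no two sharing an x- or y-coordinate, no two joining lines perpendicular or parallel) and takes π/2 as the threshold separating row swaps from column swaps; the function name `event_points` is hypothetical, and the linear-space generation via duality mentioned above is not attempted.

```python
import math
from itertools import combinations


def event_points(points):
    """Sorted list of (grid angle, kind, i, j) for every pair of points."""
    events = []
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        # Angle of the joining line with the x-axis, normalised to [0, pi).
        angle = math.atan2(q[1] - p[1], q[0] - p[0]) % math.pi
        kind = "rows" if angle < math.pi / 2 else "columns"
        # Red lines meet the pair at `angle`; blue lines at `angle - pi/2`.
        events.append((angle % (math.pi / 2), kind, i, j))
    return sorted(events)
```

Each of the n(n-1)/2 pairs contributes exactly one event in [0, π/2), so a sweep through the sorted list visits every swap once.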

Lemma 1. While processing an event point corresponding to the line joining a 
pair of points $(p_\alpha, p_\beta)$, 

(a) if the angle of the line joining $p_\alpha$ and $p_\beta$ with the x-axis is less than $\pi/2$, 
then the MERs neither of whose north or south boundaries contains $p_\alpha$ 
or $p_\beta$ will not be changed with respect to their definition. 

(b) if the angle of the line joining $p_\alpha$ and $p_\beta$ with the x-axis is greater than $\pi/2$, 
then the MERs neither of whose east or west boundaries contains $p_\alpha$ 
or $p_\beta$ will not be changed with respect to their definition. 

(c) if the angle of the line joining $p_\alpha$ and $p_\beta$ with the x-axis is less than $\pi/2$ 
and $p_\alpha$ is to the left of $p_\beta$, then MERs whose south bounding point is $p_\alpha$, 
but $p_\beta$ does not appear on any of their sides, and MERs whose north bounding 
point is $p_\beta$, but $p_\alpha$ does not appear on any of their sides, will not be changed 
with respect to their definition. 

(d) if the angle of the line joining $p_\alpha$ and $p_\beta$ with the x-axis is greater than $\pi/2$ 
and $p_\alpha$ is below $p_\beta$, then MERs whose east bounding point is $p_\alpha$, but $p_\beta$ 
does not appear on any of their sides, and MERs whose west bounding point 
is $p_\beta$, but $p_\alpha$ does not appear on any of their sides, will not be changed with 
respect to their definition. 

In view of this lemma, we state the following exhaustive set of MERs which
may take birth or die out due to the swap of a pair of rows corresponding to a
pair of points, say $p_\alpha$ and $p_\beta$ ($p_\alpha$ is assumed to be to the left of $p_\beta$). A similar set
of situations may also hold when a pair of columns swap; we will not mention
them explicitly.







Following is the exhaustive set of MERs that may die out due to the swap
of two red lines corresponding to $p_\alpha$ and $p_\beta$.

A : An MER with $p_\alpha$ and $p_\beta$ at its south and north boundaries respectively,
B : A set of MERs with $p_\alpha$ and $p_\beta$ on south and east boundaries respectively,
C : A set of MERs with $p_\beta$ on their south boundaries,
D : A set of MERs with $p_\alpha$ and $p_\beta$ on west and north boundaries respectively,
E : A set of MERs with $p_\alpha$ on their north boundaries.

And following is the exhaustive set of MERs that may possibly result from the
swap of two red lines corresponding to $p_\alpha$ and $p_\beta$.

A' : An MER with $p_\alpha$ and $p_\beta$ at its north and south boundaries respectively,
B' : A set of MERs with $p_\beta$ and $p_\alpha$ on south and west boundaries respectively,
C' : A set of MERs with $p_\alpha$ on their south boundaries,
D' : A set of MERs with $p_\alpha$ on their north boundaries and $p_\beta$ on their east boundaries,
E' : A set of MERs with $p_\beta$ on their north boundaries.

In the following section, we shall highlight the necessary actions when a row 
swap takes place and also indicate how all the cases above are taken care of. 

Now, note that the MER in A is transformed into the MER in A'.

Further, the MERs in B all collapse to form members of C', and moreover,
all the new members in C' are derived from B. In the latter case, to be a bit
more explicit, there are actually two sets of MERs: first, the ones that have
their north bounding points to the left of $p_\beta$, and second, the ones that have
their north bounding points to the right of $p_\beta$. What is to be noted is that the
north bounding points of the second class of MERs bound an MER of the first
class at the east.

Similarly, the members in C that collapse result in members of B', if they
remain at all, and conversely all the members of B' result from members in C. To
be a bit more explicit about the former case: among the MERs that collapse in
this case, only those having their north bounding points to the right of $p_\alpha$
still exist and degenerate into members of B'. The rest are all destroyed.

Again, the MERs in D degenerate into MERs of E', and all the newly
introduced MERs of E' are derived from D. Similarly, all the members of E that
collapse result in members of D', if they remain at all, and every member in D'
is derived from a member of E. These observations will guide our actions during
a row swap.
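
The correspondence between dying and newly born classes stated above can be recorded as a small lookup table. This is merely a restatement of the observations for a row swap; the names `ROW_SWAP_TRANSITIONS` and `born_from` are hypothetical and not part of the paper's data structures.

```python
# Class of MERs that die in a row swap -> class of MERs they give rise to.
# Members of B, C, D and E may also simply be destroyed; the table only
# records where survivors end up.
ROW_SWAP_TRANSITIONS = {
    "A": "A'",  # the single unbounded MER is replaced by its mirror image
    "B": "C'",  # MERs with p_alpha south, p_beta east
    "C": "B'",  # MERs with p_beta south
    "D": "E'",  # MERs with p_alpha west, p_beta north
    "E": "D'",  # MERs with p_alpha north
}


def born_from(dying_class):
    """Return the class of new MERs derived from the given dying class."""
    return ROW_SWAP_TRANSITIONS[dying_class]
```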



3.4 Updating Grid Diagram 

We need to consider the two cases - (A) swap of two adjacent rows, and (B)
swap of two adjacent columns, separately. When a new MER takes birth with a
specified set of points in its four sides, it is entered in $\mathcal{M}$ along with the initial
grid angle. When it disappears, we update the corresponding entries of $\mathcal{M}$, and
evaluate the PMER. Also, all the appropriate data structures are updated at







each stage. In this subsection, all references to storage of an MER at a matrix 
entry will imply at the south, unless otherwise specified. 

Swap of two adjacent rows 

Suppose the line joining $(p_\alpha, p_\beta)$ is under process; it has the smallest
absolute gradient among the set of unprocessed lines, and its gradient
is positive. We now study the effect of rotating the grid so that all red lines
become parallel to the line joining $(p_\alpha, p_\beta)$. Let $i$ and $i+1$ be the rows in $\mathcal{M}$
corresponding to the points $p_\alpha$ and $p_\beta$ before the rotation. Thus after the swap,
the rows corresponding to the points $p_\alpha$ and $p_\beta$ are $i+1$ and $i$ respectively. Let the
columns corresponding to $p_\alpha$ and $p_\beta$ be $k$ and $\ell$ ($k < \ell$) respectively.

Step A : The MER with $p_\alpha$ and $p_\beta$ at its south and north boundaries
respectively before the swap is unbounded at its east and west sides. This MER
will not exist further. But this gives birth to another unbounded empty rectangle
with $p_\beta$ and $p_\alpha$ at its south and north boundaries respectively. Thus the necessary
changes in the value fields in $\mathcal{M}(i, \ell)$ and $\mathcal{M}(i+1, k)$ need to be made.





Fig. 3. Illustration of (a) Step B.1, (b) Step B.2



Step B : The MERs with $p_\alpha$ and $p_\beta$ on their south and east boundaries
respectively before the swap (see Figure 3a) will eventually collapse; so, for
each of them the corresponding PMER needs to be computed. First proceed
along the $i$-th row from the extreme left and include each encountered MER in
a set $R$ till we reach i) an MER whose east vertex is to the left of $p_\beta$, or ii)
the entry $\mathcal{M}(i, k)$. In the latter case, the corresponding MER is included in $R$
if it is bounded by $p_\beta$ at the east. Next proceed from the right along the $i$-th
row to include in $R$ the first MER that is bounded by $p_\beta$ at the east, unless,
of course, this is the one represented at $\mathcal{M}(i, k)$. The last action is guided by
the observation that there can be only one MER that is bounded by $p_\beta$ to the
east, $p_\alpha$ to the south, and by a point to the right of $p_\alpha$ at the north. Now for all
the members in $R$, $p_\beta$ will no longer remain in their east boundary, and we need
to update their east boundaries as follows (see Figure 3).

B.l Consider a rectangle in R whose north and west bounding vertices are p~^ 
and py, respectively. Note that before the rotation, there must have been 
a rectangle with p-^ and pp at its north and south respectively, and it 
is also bounded by p,p at its west. Further, if the east boundary of this 







second rectangle contains then the rectangle in R we started with, shall 
degenerate to be bounded by Pf^ at the east. Thus, for each member in $R$,
replace the P1, P2 pointers of their representative entries in the $i$-th row
by the corresponding entries at the same column of the $(i+1)$-th row. If the
member of $R$ is, however, the one stored at $\mathcal{M}(i, k)$ and thus unbounded at
the north, the P1, P2 pointers are obtained from those at $\mathcal{M}(i+1, \ell)$.

B.2 Note that here a new MER appears with p^ and $p_\alpha$ on its north and south
boundaries respectively. This would, in fact, be the case for each rectangle
in $R$. In other words, the newly obtained east boundary vertex of each
of the rectangles in $R$ would now bound, at the north, a rectangle that is
bounded by $p_\alpha$ in the south. This gives rise to a new set of rectangles $R'$.
This case can also be tackled much like B.1 (as shown in Figure 3b).
The details are omitted in this version.

Step C : Next we consider the set of MERs, each having a south boundary
containing $p_\beta$. Due to the rotation, some of them will be truncated by $p_\alpha$ at
their west side. The corresponding PMERs need to be reported and the matrix
entries need to be updated to give birth to a new set of MERs.

C.1 The entries in row $i+1$ (corresponding to $p_\beta$) that store MERs are observed
one by one from the extreme right. If the west boundary of such an MER is
observed to be to the left of $p_\alpha$, we replace its west boundary by $p_\alpha$. The
scan terminates as soon as an entry is encountered which does not satisfy
the above criterion, or the cell $\mathcal{M}(i+1, \ell)$ is reached. If the MER stored
at $\mathcal{M}(i+1, \ell)$ has its west boundary to the left of $p_\alpha$, then it is replaced
by $p_\alpha$.

C.2 Next, we check the entries of the $(i+1)$-th row that represent an MER at
the south, from the extreme left of that row. All the MERs that appear to
the left of the $k$-th column, i.e., whose north boundary is defined by a point to
the left of $p_\alpha$, will not remain valid after the current rotation (see Figure
4a). The search continues along that row past the $k$-th column, to detect
the MERs having their west bounding vertices to the left of $p_\alpha$. This
set of MERs will be truncated by $p_\alpha$ to the west. We stop when the first
MER with its west boundary to the right of $p_\alpha$ is encountered or the cell
$\mathcal{M}(i+1, \ell)$ is reached.

Step D : We now consider the set of MERs having $p_\alpha$ and $p_\beta$ at their west
and north boundaries respectively. These MERs will now collapse as a result
of this swap. This case is exactly similar to that in Step B, and the new set of
MERs shall be determined in the same way by traversing along the column $\ell$ in a
downward direction from $\mathcal{M}(i+1, \ell)$, and collecting all the rectangles in a set $R$,
until i) an MER is obtained which is not bounded by $p_\alpha$ to its west side, or ii)
the bottom of the column is reached. In the latter case, the MER represented
at the 0-th row (and consequently unbounded at the south) is included in $R$ if
it has $p_\alpha$ as its west vertex. Note that, after the current grid rotation, this set
of rectangles will no longer be bounded by $p_\alpha$ towards the west.





D.1 Next, consider an MER in the set $R$. It is bounded in the north and west
by $p_\beta$ and $p_\alpha$ respectively. Suppose its south and east sides are bounded
by $p_\theta$ and $p_\eta$ respectively. After the rotation, surely this MER is not going
to remain bounded at the west by $p_\alpha$. But observe that before the rotation,
in this case, there would be an MER bounded by $p_\alpha$ on the north and $p_\theta$
on the south, and it would also have $p_\eta$ on the east. If this rectangle is
bounded by $p_\delta$ in the west, then surely the rectangle in $R$ we started with
will have $p_\delta$ at its west after the rotation (as shown in Figure 4b). Thus
the P1, P2 pointers in the entries of the $\ell$-th column corresponding to the
members in $R$ are obtained from those in the same row and $k$-th column.

D.2 It is to be noted, exactly as in Step B, that as a result of rotation, the
point $p_\delta$ will also bound an MER at the south that is bounded by $p_\beta$ at the
north. This is, however, true for all the MERs in $R$. So a new set of MERs
$R'$ thus arises, which can be identified much like in Step D.1.




Fig. 4. Illustration of (a) Step C, (b) Step D.4 



Step E : Some of the MERs with $p_\alpha$ on their north boundaries might be
affected by $p_\beta$ entering them due to rotation, and will be either truncated on
the east by $p_\beta$, or simply destroyed. The situation is similar to that in Step C.

Processing for this step involves a traversal along the $k$-th column in a
downward direction. The MERs encountered are eliminated if their south boundary
vertices are to the right of $p_\beta$, and otherwise truncated by $p_\beta$ at the east, if the
latter is to the left of the east vertex of this MER.

Here, the traversal terminates once (i) an MER is encountered which is
bounded by a point in the east that is to the left of $p_\beta$, or (ii) the bottom of the
column is reached. In the second case, however, we perform the above
processing if the east boundary of the MER represented there is to the right of $p_\beta$.

Finally, after the computation of the PMERs and the necessary updates
in $\mathcal{M}$, we swap the two rows $i$ and $i+1$ of the matrix $\mathcal{M}$. This swap requires
$O(n)$ units of time. The row-id of the points $p_\alpha$ and $p_\beta$ in $P$ will be changed,
and $p_\alpha$ and $p_\beta$ will be swapped in $P'$.

An important activity to be taken care of is the updating of the representation 
of an MER at its west, once it is introduced or deleted, provided it is bounded 
either in east or west. But this is easy upon noting that, when an MER is deleted 
or introduced, we know the exact set of bounding points. 









One crucial point is to be kept in mind while making this update: the steps
have to be executed exactly in this order, because executing one step ahead of
another may corrupt the values being used by the other. As an example, the
updates in row $i$ in Step B depend on the existing entries in row $i+1$. If we
execute Step C ahead of B, the entries in row $i+1$ get corrupted.
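
The ordering constraint can be made explicit in a driver routine. The following is a minimal sketch under stated assumptions: the matrix is a list of rows, and `steps` is a hypothetical dictionary mapping the step names "A" to "E" to callables that perform the updates described above (their bodies are not shown here and are not taken from the paper).

```python
STEP_ORDER = ("A", "B", "C", "D", "E")


def process_row_swap(matrix, i, steps):
    """Apply Steps A..E for the swap of rows i and i+1, in the fixed order.

    `steps[name](matrix, i)` is assumed to update the matrix in place.
    Step B reads row i+1 before Step C overwrites it, so the order matters.
    """
    for name in STEP_ORDER:
        steps[name](matrix, i)
    # Only after all updates are the two rows physically exchanged.
    # With a list of row objects this is a constant-time reference swap;
    # copying the entries cell by cell would take O(n) time.
    matrix[i], matrix[i + 1] = matrix[i + 1], matrix[i]
```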

Swap of two adjacent columns 

The data structures, as well as our approach, are exactly symmetric with
respect to the rows and columns. Hence exactly the same process is followed in
the case of a column swap.

Note : By our design, an MER that is unbounded in both north and south is 
never represented at the south. This could lead to a potential problem because, 
when two rows swap, this MER will not get updated in our data structures. But 
note that such an MER can never be changed in terms of its determining vertices 
if two rows are interchanged. Indeed they can be modified only if two columns 
swap, and we at once store them at the south once they become bounded either 
at the south or north. 

Symmetrical is the case for MERs that are unbounded at both west and east. 

4 Complexity Analysis 

As discussed in the preceding sections, our algorithm consists of two phases: (i)
deciding the event points, i.e., the grid angles, and (ii) the management of the grid
diagram during each step of rotation. The first phase requires $O(n^2 \log n)$ time.
It remains to analyze the time complexity of the second phase.

The construction of the initial grid matrix $\mathcal{M}$ requires $O(n^2)$ time in the worst
case. Next, we process $O(n^2)$ event points. For each such event point involving
a pair of points $p_\alpha$ and $p_\beta$, the procedure outlined in the last section traverses
at most a pair of rows and a pair of columns, and does no more than constant-time
processing at each entry, thereby involving a total of $O(n)$ time. Moreover,
all the PMERs are evaluated in our algorithm. Thus, we have the final theorem
stating the time complexity of our algorithm.

Theorem 1. (a) The number of PMERs generated during the entire processing
may be $O(n^3)$ in the worst case, and (b) the time complexity of our algorithm is
also $O(n^3)$ in the worst case. □

As an $(n+1) \times (n+1)$ matrix is maintained throughout the processing, the space
complexity is $O(n^2)$.
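
The time bound can be retraced by adding up the costs stated in this section. The following is only a restatement of those costs, assuming the $O(n)$ work per event point claimed above:

```latex
\underbrace{O(n^2 \log n)}_{\text{sorting the event points}}
\;+\; \underbrace{O(n^2)}_{\text{initial grid matrix}}
\;+\; \underbrace{O(n^2)}_{\text{event points}} \cdot \underbrace{O(n)}_{\text{work per event}}
\;=\; O(n^3).
```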



References 

1. A. Aggarwal and S. Suri, Fast algorithm for computing the largest empty rectangle, 
Proc. 3rd Annual ACM Symp. on Computational Geometry, pp. 278-290, 1987. 
34 

2. D. P. Dobkin, H. Edelsbrunner and M. H. Overmars, Searching for empty convex 
polygons, Proc. 4th ACM Symp. on Computational Geometry, pp. 224-228, 1988. 
34 



46 



Jeet Chaudhuri and Subhas C. Nandy 



3. A. Naamad, D. T. Lee and W. L. Hsu, On the maximum empty rectangle problem, 
Discrete Applied Mathematics, vol. 8, pp. 267-277, 1984. 34 

4. S. C. Nandy, B. B. Bhattacharya and S. Ray, Efficient algorithms for identifying 
all maximal empty rectangles in VLSI layout design, Proc. FSTTCS - 10, Lecture 
Notes in Computer Science, vol. 437, Springer, pp. 255-269, 1990. 34 

5. S. C. Nandy, A. Sinha and B. B. Bhattacharya, Location of largest empty rectangle 
among arbitrary obstacles, Proc. FSTTCS - 14, Lecture Notes in Computer Science, 
vol. 880, Springer, pp. 159-170, 1994. 34 

6. M. Orlowski, A new algorithm for largest empty rectangle problem, Algorithmica, 
vol. 5, pp. 65-73, 1990. 34, 38 



Renaming Is Necessary in Timed Regular 
Expressions 



Philippe Herrmann 



LIAFA, Université Paris 7
2, place Jussieu F-75251 Paris Cedex 05 France
herrmann@liafa.jussieu.fr



Abstract. We prove that timed regular expressions without renaming 
are strictly less expressive than timed automata, as conjectured by 
Asarin, Caspi and Maler in [3], where this extension of regular expres- 
sions was introduced. We also show how this result allows us to exhibit 
an infinite hierarchy of timed regular expressions. 



1 Introduction 

Among the different models that have been developed in order to describe real- 
time systems, the timed automata of Alur and Dill [2] are particularly interesting 
since they provide a timed counterpart of finite state automata, which have been 
studied intensively by the formal languages community. Thus it is only natural 
to try to adapt well-known results about finite state automata, of which there
are plenty, to the more general picture of timed automata. A basic result of
automata theory being Kleene’s theorem [7], stating the equivalence between fi- 
nite automata and regular expressions, Asarin, Caspi and Maler designed timed 
regular expressions in [3]. They proved that, at the price of augmenting regular 
expressions with a very natural timed restriction operator, and a somewhat less 
natural but indispensable conjunction operator, one gets timed regular expres- 
sions which are exactly as expressive as timed automata up to renaming. That 
is, every timed regular expression is equivalent to a timed automaton, while 
every timed automaton can be translated into an equivalent timed regular ex- 
pression provided a subset of the actions are renamed. Asarin, Caspi and Maler 
conjectured that renaming is indeed necessary to get the full expressive power of 
timed automata. In this paper we prove that their conjecture was indeed correct, 
namely that one cannot get rid of the renaming. To achieve this goal, we first 
give in Sect. 6 an easy proof of the necessity of introducing the conjunction oper- 
ator, a result that already appeared in the original paper [3], but with a different 
construction. Then we proceed in Sect. 7 with the much more involved proof of 
the necessity of renaming. Finally, in Sect. 8, we introduce an infinite hierarchy 
of timed regular expressions, based on the number of conjunction operators and 
the use of renaming. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 47—59, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






2 Timed Regular Expressions, Timed Automata 

2.1 Notations 

Let $\Sigma$ be a finite alphabet and let $\mathbb{R}_{\geq 0}$ denote the set of nonnegative reals. A
timed language is a subset of $(\Sigma \times \mathbb{R}_{\geq 0})^+$, i.e. a set of timed words (note that
the notion of empty word does not appear in our definition).

Intuitively, a timed word $w = (a_1, \delta_1)(a_2, \delta_2) \cdots (a_n, \delta_n)$ corresponds to an
action $a_1$ of duration $\delta_1$, followed by an action $a_2$ of duration $\delta_2$, and so on, ending
by an action $a_n$ of duration $\delta_n$. The duration of $w$ is denoted by $\delta(w) = \sum_{k=1}^{n} \delta_k$.

To such a timed word we associate a non-decreasing sequence $(t_0, t_1, \ldots, t_n)$
with $t_0 = 0$ and $t_{k+1} = t_k + \delta_{k+1}$ (the intended meaning being that action $a_k$
starts at time $t_{k-1}$ and ends at time $t_k$). So $\delta_i$ will be called the duration of
event $i$, while $t_i$ will be its time stamp. Sequences of durations and sequences of
time stamps are in bijection.
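
The bijection between durations and time stamps is just a running sum and its inverse. A minimal sketch, with the hypothetical function names `time_stamps` and `durations`:

```python
from itertools import accumulate


def time_stamps(durations_):
    """Map (delta_1, ..., delta_n) to (t_0, ..., t_n) with t_0 = 0."""
    return list(accumulate(durations_, initial=0))


def durations(stamps):
    """Inverse map: delta_k = t_k - t_{k-1} for k = 1, ..., n."""
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

Note that the sequence of time stamps has one more element than the sequence of durations, since $t_0 = 0$ is always included.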

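A timed interval is an interval of $\mathbb{R}_{\geq 0}$ of the form $[a, b]$, $[a, b)$, $(a, b]$, $(a, b)$,
$[a, +\infty)$ or $(a, +\infty)$ with $a$ and $b$ in $\mathbb{N}$. We denote $[a, a]$ by $a$.

Let $\sigma$ be a mapping from $\Sigma$ to $\Sigma$. The domain and range of $\sigma$ can be extended
to $\Sigma \times \mathbb{R}_{\geq 0}$ by letting $\sigma((a, \delta)) = (\sigma(a), \delta)$ for $a \in \Sigma$. We also denote by $\sigma$ the
generated morphism from $(\Sigma \times \mathbb{R}_{\geq 0})^+$ to $(\Sigma \times \mathbb{R}_{\geq 0})^+$ and call it a renaming.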

3 Timed Regular Expressions 

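Timed Regular Expressions or TREs were introduced in [3] and are defined
inductively as follows (we let $\phi$, $\phi_1$ and $\phi_2$ be TREs):

– $a$ is a TRE, with $a \in \Sigma$ (atom);
– $\phi_1 \wedge \phi_2$ is a TRE (conjunction);
– $\phi_1 \vee \phi_2$ is a TRE (disjunction);
– $\phi_1 \cdot \phi_2$ is a TRE (concatenation);
– $\phi^+$ is a TRE (iteration);
– $\langle \phi \rangle_I$ is a TRE (restriction), where $I$ is a timed interval.

The interpretation of a TRE is given by $[\![\cdot]\!]$, which is a function from the set
of TREs to the set of timed languages defined inductively by:

– $[\![a]\!] = \{(a, \delta) \mid \delta \in \mathbb{R}_{\geq 0}\}$ for $a \in \Sigma$;
– $[\![\phi_1 \wedge \phi_2]\!] = [\![\phi_1]\!] \cap [\![\phi_2]\!]$;
– $[\![\phi_1 \vee \phi_2]\!] = [\![\phi_1]\!] \cup [\![\phi_2]\!]$;
– $[\![\phi_1 \cdot \phi_2]\!] = \{w_1 w_2 \mid w_1 \in [\![\phi_1]\!], w_2 \in [\![\phi_2]\!]\} = [\![\phi_1]\!] \cdot [\![\phi_2]\!]$;
– $[\![\phi^+]\!] = \bigcup_{k > 0} \{w_1 \cdots w_k \mid w_1, \ldots, w_k \in [\![\phi]\!]\} = \bigcup_{k > 0} [\![\phi]\!]^k$;
– $[\![\langle \phi \rangle_I]\!] = \{w \mid w \in [\![\phi]\!], \delta(w) \in I\}$.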
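
The inductive semantics above translates directly into a (deliberately naive, exponential-time) membership test. The sketch below is an illustration only: the class names `Atom`, `And`, `Or`, `Concat`, `Plus`, `Restrict` and the function `matches` are assumptions of this example, a timed word is represented as a list of `(action, duration)` pairs, and an unbounded interval can be given with `hi=float("inf")`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    action: str


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Concat:
    left: object
    right: object


@dataclass(frozen=True)
class Plus:
    body: object


@dataclass(frozen=True)
class Restrict:
    body: object
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True


def matches(expr, word):
    """Return True iff the timed word belongs to the language of expr."""
    if not word:
        return False  # the empty word belongs to no TRE language
    cuts = range(1, len(word))  # split points leaving two non-empty parts
    if isinstance(expr, Atom):
        return len(word) == 1 and word[0][0] == expr.action and word[0][1] >= 0
    if isinstance(expr, And):
        return matches(expr.left, word) and matches(expr.right, word)
    if isinstance(expr, Or):
        return matches(expr.left, word) or matches(expr.right, word)
    if isinstance(expr, Concat):
        return any(matches(expr.left, word[:k]) and matches(expr.right, word[k:])
                   for k in cuts)
    if isinstance(expr, Plus):
        # One iteration, or a first iteration followed by further ones.
        return matches(expr.body, word) or any(
            matches(expr.body, word[:k]) and matches(expr, word[k:]) for k in cuts)
    if isinstance(expr, Restrict):
        total = sum(duration for _, duration in word)
        above = total >= expr.lo if expr.lo_closed else total > expr.lo
        below = total <= expr.hi if expr.hi_closed else total < expr.hi
        return above and below and matches(expr.body, word)
    raise TypeError(f"unknown TRE node: {expr!r}")
```

The recursion terminates because every recursive call on `Plus` itself is made on a strictly shorter suffix of the word.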




Note that the iteration operator is a Kleene plus, not a Kleene star: for any
TRE $\phi$, $\varepsilon \notin [\![\phi]\!]$. Also we do not use signals, but timed words. We do so in order
to simplify subsequent proofs, as pointed out in Sect. 9.

The function sons is defined on TREs and returns a set of TREs: for an
atom it returns the empty set, for the conjunction, disjunction and concatenation
it returns the two sub-TREs, and for the iteration and restriction it returns the
single sub-TRE. The syntax tree of a TRE $\phi$ is a tree with root $\phi$, and each
node $\psi$ has $sons(\psi)$ as sons. A cut is a set of nodes (i.e. TREs) that contains
exactly one node from every branch from the root to a leaf.

A $\wedge$-free TRE (resp. $\vee$-free) is a TRE containing no $\wedge$ (resp. $\vee$) operator.

A TRE can be put in disjunctive normal form (disjunction of a finite number
of $\vee$-free TREs) using the fact that conjunction, concatenation and restriction
distribute over disjunction. For the iteration, we check that the following identity
on regular expressions extends to timed ones:

$$[\![(\phi_1 \vee \phi_2)^+]\!] = [\![\phi_1^+ \vee \phi_2^+ \vee (\phi_1^+ \cdot \phi_2^+)^+ \vee (\phi_2^+ \cdot \phi_1^+)^+ \vee (\phi_1^+ \cdot \phi_2^+)^+ \cdot \phi_1^+ \vee (\phi_2^+ \cdot \phi_1^+)^+ \cdot \phi_2^+]\!]$$

3.1 Timed Automata 

A timed automaton or TA (see [1]) is a tuple $A = (\Sigma, Q, S, F, X, E)$ where $\Sigma$ is
a finite set of actions, $Q$ a finite set of states, $S \subseteq Q$ is the set of initial states,
$F \subseteq Q$ the set of final states, $X$ a finite set of clocks, $E$ a finite set of transitions
of the form $(q, q', a, g, \rho)$ where $q$ and $q'$ are states of $Q$, $a$ an action, $g$ is a
condition on clocks (see below) and $\rho$ a subset of $X$ called the reset set of the
transition.

A condition on clocks is a possibly empty conjunction of terms of the form
$x \in I$ where $x$ is a clock and $I$ a timed interval (we say that $x \in I$ is part of
the condition). Recall that the end points of a timed interval are integers or
$+\infty$. A clock valuation is a vector $\nu \in (\mathbb{R}_{\geq 0})^X$; the valuation of clock $x$
in $\nu$ is denoted by $\nu(x)$. For a reset set $\rho \subseteq X$ and a clock valuation $\nu$, we
define $\mathrm{Reset}_\rho(\nu) \in (\mathbb{R}_{\geq 0})^X$ by $\mathrm{Reset}_\rho(\nu)(x) = 0$ if $x \in \rho$ and $\nu(x)$ otherwise. A
condition on clocks holds or not for a given clock valuation, and the evaluation
of a condition $g$ on a valuation $\nu$ is denoted by $g(\nu)$. We define the valuation
$\nu + \delta$ with $\delta \in \mathbb{R}_{\geq 0}$ by $(\nu + \delta)(x) = \nu(x) + \delta$.

Let $\pi = q_0 \xrightarrow{e_1} q_1 \xrightarrow{e_2} \cdots \xrightarrow{e_n} q_n$ with $n > 0$ be a path of a TA $A$ (i.e. $q_k \in Q$
for $0 \leq k \leq n$ and the source state of $e_k \in E$ is $q_{k-1}$ while its goal state is $q_k$ for
$1 \leq k \leq n$). For such a path, the reset set of $e_k$ will be denoted by $\rho_k$ (we let
$\rho_0 = X$ by convention), its condition on clocks by $g_k$ and its action by $a_k$. The
path is accepting if $q_0 \in S$ and $q_n \in F$ (i.e. it starts in an initial state and ends
in a final state). The trace of $\pi$ is the untimed word $a_1 \cdots a_n$. A cycle is a path
which starts and ends in the same state. Now a run of $A$ associated with $\pi$ is a
sequence

$$(q_0, \nu_0) \xrightarrow{e_1, \delta_1} (q_1, \nu_1) \xrightarrow{e_2, \delta_2} \cdots \xrightarrow{e_n, \delta_n} (q_n, \nu_n)$$

(where $\nu_k$ is a clock valuation and $\delta_k \in \mathbb{R}_{\geq 0}$) which satisfies the additional
conditions: $g_k(\nu_{k-1} + \delta_k)$ holds and $\nu_k = \mathrm{Reset}_{\rho_k}(\nu_{k-1} + \delta_k)$. The trace of this





run is the timed word $(a_1, \delta_1) \cdots (a_n, \delta_n)$. A run is accepting if the associated
path is accepting and $\nu_0 = \{0\}^X$ (i.e. the clocks are set to 0 initially).

The semantics of a TA $A$ is a timed language $\mathcal{L}(A)$ defined as follows: $w \in \mathcal{L}(A)$
iff $w$ is the trace of an accepting run of $A$.
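
The run conditions can be checked mechanically along a fixed path. The sketch below is an illustration under assumptions of its own: a transition is a tuple `(q, q2, action, guard, resets)`, a guard is a list of `(clock, interval)` pairs, and an interval is `(lo, hi, lo_closed, hi_closed)`. The function only tests whether a timed word is the trace of a run along the given path starting from the all-zero valuation; it does not test that the path is connected or accepting.

```python
def in_interval(value, interval):
    """Membership test for an interval (lo, hi, lo_closed, hi_closed)."""
    lo, hi, lo_closed, hi_closed = interval
    above = value >= lo if lo_closed else value > lo
    below = value <= hi if hi_closed else value < hi
    return above and below


def is_trace_of_run(path, word, clocks):
    """Check the run conditions for `word` along `path`.

    path: list of transitions (q, q2, action, guard, resets)
    word: list of (action, duration) pairs
    """
    if len(path) != len(word):
        return False
    valuation = {x: 0 for x in clocks}  # nu_0: every clock starts at 0
    for (_, _, action, guard, resets), (a, delta) in zip(path, word):
        if a != action or delta < 0:
            return False
        # Let delta_k time units elapse: nu_{k-1} + delta_k.
        valuation = {x: v + delta for x, v in valuation.items()}
        # The guard g_k must hold before the resets are applied.
        if not all(in_interval(valuation[x], itv) for x, itv in guard):
            return False
        # nu_k = Reset_{rho_k}(nu_{k-1} + delta_k).
        for x in resets:
            valuation[x] = 0
    return True
```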

4 From TREs to TAs 

This section is an adaptation of the translation from expressions to automata 
of [3]: recall that we do not allow the empty word as part of our languages. 

A TRE $\phi$ can be translated into a TA $A$ such that $[\![\phi]\!] = \mathcal{L}(A)$. To see this,
we proceed by structural induction. For the base case, note that an atom $a \in \Sigma$
corresponds to the TA $(\{a\}, \{q_0, q_1\}, \{q_0\}, \{q_1\}, \emptyset, \{(q_0, q_1, a, \emptyset)\})$. For the
induction hypothesis, let $\phi_1$ correspond to $(\Sigma_1, Q_1, S_1, F_1, X_1, E_1)$, while $\phi_2$
corresponds to $(\Sigma_2, Q_2, S_2, F_2, X_2, E_2)$, where the TAs are such that $Q_1$ and $Q_2$
as well as $X_1$ and $X_2$ are disjoint, and such that no final state has an outgoing
transition (this is verified in the base case). Then we have:

– $\phi = \phi_1 \wedge \phi_2$ becomes $(\Sigma_1 \cup \Sigma_2, Q_1 \times Q_2, S_1 \times S_2, F_1 \times F_2, X_1 \uplus X_2, E)$ where
$E$ is defined as the smallest set such that $(q_1, q_1', a, g_1, \rho_1) \in E_1$ and
$(q_2, q_2', a, g_2, \rho_2) \in E_2$ implies $((q_1, q_2), (q_1', q_2'), a, g_1 \wedge g_2, \rho_1 \uplus \rho_2) \in E$;

– $\phi = \phi_1 \vee \phi_2$ becomes $(\Sigma_1 \cup \Sigma_2, Q_1 \uplus Q_2, S_1 \uplus S_2, F_1 \uplus F_2, X_1 \uplus X_2, E_1 \uplus E_2)$;

– $\phi = \phi_1 \cdot \phi_2$ is translated into $(\Sigma_1 \cup \Sigma_2, Q_1 \setminus F_1 \uplus Q_2, S_1, F_2, X_1 \uplus X_2, E)$
where $E$ is obtained by replacing every transition $(q, q', a, g, \rho)$ of $E_1$ with
$q' \in F_1$ by the set of transitions $\{(q, q'', a, g, X_2) \mid q'' \in S_2\}$ and leaving
the other transitions just as they are (the fact that we do not recognize the
empty word and that $F_1$ has no outgoing transition plays a crucial role here).
Note that all clocks of $X_2$ are reset when moving from the automaton of $\phi_1$
to the automaton of $\phi_2$ and that the clocks in $X_1$ will never be used again;

– $\phi = \phi_1^+$ is translated into $(\Sigma_1, Q_1, S_1, F_1, X_1, E)$ where $E$ is obtained by
adding to $E_1$ the set of transitions $\{(q, q'', a, g, X_1) \mid q'' \in S_1\}$ for each
transition $(q, q', a, g, \rho)$ of $E_1$ with $q' \in F_1$. Note that all clocks are reset
after each iteration;

– $\phi = \langle \phi_1 \rangle_I$ is translated into $(\Sigma_1, Q_1, S_1, F_1, X_1 \uplus \{x\}, E)$ with $x \notin X_1$, where
$E$ is obtained by changing every transition $(q, q', a, g, \rho)$ of $E_1$ with $q' \in F_1$
into a transition $(q, q', a, g \wedge (x \in I), \rho)$ (the fact that no state of $F_1$ has an
outgoing transition is important here).

At each step, we check that no final state has an outgoing transition, and
that the constructed TA and $\phi$ have the same semantics (the proofs are left
to the reader). The TA obtained from $\phi$ in the way we just described will be
denoted by $A_\phi$.

5 Technical Results 

This section presents a few technical lemmas based on ideas from [4] and [6] . 
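Let $A$ be a TA, and let $\pi = q_0 \xrightarrow{e_1} q_1 \xrightarrow{e_2} \cdots \xrightarrow{e_n} q_n$ be a path of $A$. Let $\rho_0 = X$ and

$$C_{i,j} = \{I \mid \exists x \text{ such that } x \in \rho_i \setminus (\rho_{i+1} \cup \ldots \cup \rho_{j-1}) \text{ and } (x \in I) \text{ is part of } g_j\}$$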






It is the set of timed intervals being part of the condition on clocks $g_j$ involving
clocks that are reset in $\rho_i$ and not reset again until at least $\rho_j$. With $\pi$ we
associate a graph denoted by $G(\pi)$ with $n + 1$ nodes labeled by $0 \ldots n$, and
containing for all nodes $i < j$ s.t. $C_{i,j} \neq \emptyset$ the edge $i \xrightarrow{I_{i,j}} j$ with $I_{i,j} = \bigcap_{c \in C_{i,j}} c$.

We will also say that $G(\pi)$ is a graph of $A$ as an abbreviation.

Almost by definition (recall that $t_k = \sum_{i=1}^{k} \delta_i$):

Lemma 1. Let $\pi$ be a path of a TA $A$ with $trace(\pi) = a_1 \cdots a_n$. Then we have
that the timed word $(a_1, \delta_1) \cdots (a_n, \delta_n)$ is the trace of a run associated with $\pi$ iff
for all edges $i \xrightarrow{I_{i,j}} j$ of $G(\pi)$ we have $t_j - t_i \in I_{i,j}$.

Proof. See [6].
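
The graph of a path and the condition of Lemma 1 can be computed by tracking, for each clock, the index of the transition that last reset it. This is a sketch under assumptions of its own: `guards[j-1]` and `resets[j-1]` describe transition $e_j$, a guard is a list of `(clock, interval)` pairs, and an interval is `(lo, hi, lo_closed, hi_closed)`. Instead of intersecting the intervals of $C_{i,j}$, the sketch keeps them as a list and tests membership in each, which is equivalent.

```python
def in_interval(value, interval):
    """Membership test for an interval (lo, hi, lo_closed, hi_closed)."""
    lo, hi, lo_closed, hi_closed = interval
    above = value >= lo if lo_closed else value > lo
    below = value <= hi if hi_closed else value < hi
    return above and below


def path_graph(clocks, guards, resets):
    """Return {(i, j): [intervals in C_{i,j}]} for the non-empty C_{i,j}."""
    last_reset = {x: 0 for x in clocks}  # rho_0 = X by convention
    edges = {}
    for j, (guard, reset) in enumerate(zip(guards, resets), start=1):
        for x, interval in guard:
            # x was last reset at transition i = last_reset[x] < j.
            edges.setdefault((last_reset[x], j), []).append(interval)
        for x in reset:
            last_reset[x] = j
    return edges


def satisfies(edges, durations):
    """Lemma 1 condition: t_j - t_i lies in every interval on edge (i, j)."""
    stamps = [0]
    for delta in durations:
        stamps.append(stamps[-1] + delta)
    return all(in_interval(stamps[j] - stamps[i], c)
               for (i, j), intervals in edges.items() for c in intervals)
```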

A graph $G(\pi)$ contains a crossing denoted by $i\ j\ k\ l$ (where $i < j < k < l$
are nodes of $G(\pi)$) if there is an edge $i \xrightarrow{I_{i,k}} k$ and an edge $j \xrightarrow{I_{j,l}} l$.
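
Detecting crossings is a direct check over pairs of edges. A minimal sketch, where `edges` is any collection of `(source, target)` pairs and the function name `crossings` is hypothetical:

```python
def crossings(edges):
    """Return all crossings (i, j, k, l), i.e. edges (i, k) and (j, l)
    with i < j < k < l."""
    ordered = sorted(edges)
    return [(i, j, k, l)
            for (i, k) in ordered
            for (j, l) in ordered
            if i < j < k < l]
```

Nested edges such as (0, 3) and (1, 2), or edges that merely share an end point, are not crossings under this definition.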



The following result gives a characterization of $\wedge$-free TREs:

Lemma 2. Let $\phi$ be a $\wedge$-free TRE, and let $A_\phi$ be its associated TA. Then no
graph of $A_\phi$ contains a crossing.

Proof. Proceed by structural induction on $\wedge$-free TREs. The result is trivial
for atoms. For disjunction, note that the paths and hence graphs of the
corresponding TA are obtained by taking the union of the paths of the sub-TAs.
For concatenation and iteration, the fact that the clocks are reset respectively
when moving from the first to the second TA or when starting a new iteration
shows that the graphs of the resulting automaton are obtained by concatenating
certain graphs of the sub-TAs. For restriction, edges are only added between the
first and the last node of certain graphs (those associated with accepting paths).
All these operations cannot create a crossing. The $\wedge$-freeness is important since
conjunction corresponds roughly to the 'superposition' of graphs which may very
well create a crossing.



Let $\pi = q_0 \cdots q_n$ (resp. $\pi' = q_0' \cdots q_{n'}'$) be an accepting path in a TA $A$
(resp. $A'$). We write that $\pi' \preceq \pi$ ($\pi'$ is less constraining than $\pi$) iff

– $trace(\pi) = trace(\pi')$;
– $i \xrightarrow{I_{i,j}'} j$ in $G(\pi')$ implies $i \xrightarrow{I_{i,j}} j$ in $G(\pi)$ with $I_{i,j} \subseteq I_{i,j}'$.

Note that $\preceq$ is a partial order. This definition is motivated by:

Lemma 3. Let $\phi$ be a $\vee$-free TRE such that $A_\phi$ has an accepting path $\pi$ of
length $n$ for which there is a crossing $0\ i\ j\ n$ in $G(\pi)$. Then for all $\phi' \in
sons(\phi)$, $A_{\phi'}$ has a path $\pi'$ satisfying $\pi' \preceq \pi$.






Proof. The different possible cases are:

– $\phi = \phi_1 \wedge \phi_2$: the accepting path $\pi$ can only be obtained as the synchronous
product of an accepting path $\pi_1$ in $A_{\phi_1}$ and an accepting path $\pi_2$ in $A_{\phi_2}$
with the same traces as $\pi$. The condition on edges is ensured by the fact
that the constraints in $A_\phi$ are obtained by conjuncting constraints of $A_{\phi_1}$
and $A_{\phi_2}$;

– $\phi = \phi_1 \cdot \phi_2$: $\pi$ is obtained roughly by concatenating a path $\pi_1$ of $A_{\phi_1}$ and
a path $\pi_2$ of $A_{\phi_2}$. All the clocks are reset when we 'move' from $\pi_1$ to $\pi_2$:
this contradicts the fact that there is a crossing $0\ i\ j\ n$ in $\pi$ (recall that
the crossing starts in the first node and ends in the last). Hence this case is
impossible and we have nothing to prove;

– $\phi = (\phi_1)^+$: $A_\phi$ is obtained from $A_{\phi_1}$ by adding edges resetting all clocks.
Since there is a crossing $0\ i\ j\ n$ in $G(\pi)$ (starting in the first node and
ending in the last), these edges cannot be part of $\pi$. Therefore $\pi$ also appears
as an accepting path in $A_{\phi_1}$;

– $\phi = \langle \phi_1 \rangle_I$: since $A_\phi$ is obtained from $A_{\phi_1}$ by adding a condition on a new
clock on edges ending in a final state, the path $\pi$ in $A_\phi$ already appears in
$A_{\phi_1}$ and the only difference in the corresponding graphs is for the edge from
node 0 to node $n$, for which the second condition holds.

The lemma is true for TREs which are not $\vee$-free as long as they are not of
the form $\phi_1 \vee \phi_2$. Also note that $\phi = \phi_1 \cdot \phi_2$ is impossible.

Corollary 1. Let $\phi$ be a $\vee$-free TRE such that $A_\phi$ accepts a timed word $w$
through a path $\pi$ of length $n$ for which there is a crossing $0\ i\ j\ n$. Then for
all $\phi' \in sons(\phi)$, $w \in [\![\phi']\!]$.

Proof. We use Lemma 3 and Lemma 1.

6 Necessity of the Conjunction Operator 

In the usual definition of (untimed) regular expressions, there is no conjunction
operator such as $\wedge$: regular expressions and finite automata having the same
expressive power, it is clear that such an operator is unnecessary. In the timed
case, we wish to show that we cannot get rid of the intersection operator $\wedge$ if
we want TREs to be as expressive as TAs. A more involved proof of this result
appears in [3] in the case of signals.

Proposition 1. Let $\phi = (\langle a \cdot b \rangle_1 \cdot c) \wedge (a \cdot \langle b \cdot c \rangle_1)$. There is no $\wedge$-free TRE $\phi'$
such that $[\![\phi']\!] = [\![\phi]\!]$.

Proof. Let (f) = {{a ■ b)i ■ c) A {a ■ {b ■ c)i). We proceed by contradiction: sup- 
pose there exists a A-free TRE </>' such that |<()]] = It is clear that the 
timed word w= (a, |)(6, |)(c, |) is recognized by A^' through an accepting 
path 7T. By lemma 2 , we obtain that the graph G(7 t) doesn’t contain any crossing 

0 12 3 . Hence we cannot at the same time have an edge 0 — 2 and an 



Renaming Is Necessary in Timed Regular Expressions 



53 



edge $1 \to 3$. Suppose there is no edge $0 \to 2$. Let us consider node 2 (corresponding
to the end of the $b$ event): it may share edges only with nodes 1 and 3,
and by lemma 1, the timed intervals $I_{1,2}$ and $I_{2,3}$ (if they exist) must contain
$(0, 1)$, since $t_2 - t_1 = t_3 - t_2 = \frac{1}{2}$. Using lemma 1, we see that the timed word

w’ = {a, 5) (6, |)(c, j) is recognized by through tt. Hence w' S |(()'], which 

is a contradiction since $w' \notin [\![\varphi]\!]$. The case where there is no edge $1 \to 3$ is
similar.



7 Necessity of Renaming 

In this section we prove that TREs alone cannot express the whole
power of TAs: we need to add renaming, defined earlier in Sect. 2.1. In
order to prove that renaming is indeed necessary, which is a problem that was
left open in [3], consider the TA $A$ of Fig. 1. It is easy to see that $\mathcal{L}(A) = \sigma([\![\varphi]\!])$
where $\varphi = \langle a^+ \cdot b\rangle_1 \cdot a^+ \wedge a^+ \cdot \langle b \cdot a^+\rangle_1$ and $\sigma$ maps $a$ and $b$ to $a$. Note that $\sigma([\![\varphi]\!])$
and $[\![\langle a^+ \cdot a\rangle_1 \cdot a^+ \wedge a^+ \cdot \langle a \cdot a^+\rangle_1]\!]$ are not equal for reasons of synchronization on
the $b$ event in $\varphi$: this is the idea we exploit to prove the necessity of renaming.




Fig. 1. The TA $A$



Proposition 2. There is no TRE $\psi$ such that $[\![\psi]\!] = \sigma([\![\varphi]\!])$.

Proof. We proceed by contradiction. So let us suppose that there exists a TRE $\psi$
such that $[\![\psi]\!] = \sigma([\![\varphi]\!])$. We may assume that $\psi$ is in disjunctive normal form,
i.e. $\psi = \bigvee_{i=1}^{n} \psi_i$ where the $\psi_i$'s are $\vee$-free. Hence $A_\psi$ is the juxtaposition of $n$ TAs
$A_{\psi_1}, \ldots, A_{\psi_n}$. Let $K$ be an integer greater than the number of states of $A_\psi$
(and big enough for all subsequent constructions). At least one of the TAs, say
$A_{\psi_1}$ w.l.o.g., accepts the timed word $w = (a^{4K}, (t_i)_{1 \le i \le 4K})$, where the sequence
of durations/time stamps is any sequence satisfying the following:

- $t_0 = 0 < t_1 < t_2 < \cdots < t_{4K}$;

- $t_{2K} = 1$;

- $t_{2K-1} = t_{4K} - 1$;

- $t_{4K-1} < t_1 + 1$.






The constraints on this timed word may be visualized by: 

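[Diagram: time line with positions 0, 1, ..., 2K-1, 2K, ..., 4K-1, 4K; the span from 0 to 2K and the span from 2K-1 to 4K each have length 1, and the span from 1 to 4K-1 has length less than 1.]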

We define the predicate $crossing(\chi)$ where $\chi$ is a TRE. It is true iff $A_\chi$ has
at least one path $\pi$ through which $w$ is recognized and such that $G(\pi)$ has a
crossing 0 i j 4K. Also let $keep$ be the set of TREs satisfying $\neg crossing$ (in
particular it contains $\chi$ whenever $w \notin [\![\chi]\!]$).

We consider the way $\psi_1$ is built, that is the syntax tree of this TRE: note that
since $\psi_1$ is $\vee$-free, it is also the case for all nodes of the syntax tree. Moreover
the number of states of any TA associated with a node of the syntax tree is less
than $K$.

We construct a sequence of cuts $(C_i)_{i>0}$ in the following way:

- $C_1$ is the root of the tree (corresponding to $\psi_1$);

- $C_{i+1} = (C_i \cap keep) \cup sons(C_i \setminus keep)$ (it is clearly a cut if $C_i$ is a cut because
$keep$ includes all leaves of any syntax tree).



Lemma 4. For all $i \ge 1$, for all $\chi \in C_i$, $w \in [\![\chi]\!]$.

Proof. We proceed by induction.

- base case: $C_1 = \{\psi_1\}$, and by hypothesis $w \in [\![\psi_1]\!]$;

- induction hypothesis (IH): all members $\chi$ of $C_i$ are such that $w \in [\![\chi]\!]$;

- from $C_i$ to $C_{i+1}$: let $\chi' \in C_{i+1}$. Either $\chi' \in C_i$ and we use the IH directly,
or $\chi' \in sons(\chi)$ with $\chi \in C_i$. But then $crossing(\chi)$ is true, and since $\chi$
is $\vee$-free (just as any node of the syntax tree) we apply corollary 1 to
prove that $w \in [\![\chi']\!]$. This ends the proof.

Since the syntax tree is finite, there is a rank $l$ for which $C_{l+1} = C_l \cap keep = C_l$.
Thus for all $\chi \in C_l$, $A_\chi$ recognizes $w$ through a path $\pi = q_0 \cdots q_{4K}$ with no
crossing 0 i j 4K. And since $K$ is greater than the number of states of any node
of the syntax tree, there exist $1 \le j_1 < j_2 \le 2K - 1$ and $2K \le j_3 < j_4 \le 4K - 1$
such that $q_{j_1} = q_{j_2}$ and $q_{j_3} = q_{j_4}$ ('left and right cycles').

For $\chi \in C_l$, let $ln(\chi)$ denote the product $(j_2 - j_1)(j_4 - j_3)$ of the lengths of
the aforementioned cycles. Let

$$L = \prod_{\chi \in C_l} ln(\chi).$$

Now let $w' = (a^{4K+L}, (t'_i)_{1 \le i \le 4K+L})$ be any timed word satisfying:

- $t'_0 = 0 < t'_1 < \cdots < t'_{4K+L}$;

- $t'_{2K+L} = 1$;

- $t'_{2K-1} = t'_{4K+L} - 1$;

- $t'_{4K-1+L} < t'_1 + 1$.

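The constraints on $w'$ can be obtained from those on $w$ by shifting positions
$2K \cdots 4K$ by $L$ places to the right, as can be visualized by:

[Diagram: time line with positions 0, 1, ..., 2K-1, ..., 2K+L, ..., 4K-1+L, 4K+L; the span from 0 to 2K+L and the span from 2K-1 to 4K+L each have length 1, and the span from 1 to 4K-1+L has length less than 1.]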

Lemma 5. For all $\chi \in C_l$, $w' \in [\![\chi]\!]$.

Proof. We know that $A_\chi$ recognizes $w$ through a path $\pi$ such that $G(\pi)$ has
no crossing 0 i j 4K; also, $\pi$ is such that there exist $1 \le j_1 < j_2 \le 2K - 1$
and $2K \le j_3 < j_4 \le 4K - 1$ with $q_{j_1} = q_{j_2}$, $q_{j_3} = q_{j_4}$, and satisfying
the fact that $(j_2 - j_1)(j_4 - j_3)$ divides $L$. Hence $\pi = \beta_1 \alpha_1 \beta_2 \alpha_2 \beta_3$, with
$|\beta_1| = j_1$, $|\alpha_1| = j_2 - j_1$, $|\beta_2| = j_3 - j_2$, $|\alpha_2| = j_4 - j_3$ and $|\beta_3| = 4K - j_4$. Let
$\pi_l = \beta_1 \alpha_1^{1 + L/(j_2 - j_1)} \beta_2 \alpha_2 \beta_3$, and let $\pi_r = \beta_1 \alpha_1 \beta_2 \alpha_2^{1 + L/(j_4 - j_3)} \beta_3$. Therefore
$\pi_r$ and $\pi_l$ are accepting paths of $A_\chi$ and their traces are the same and
equal to $a^{4K+L}$. We now show that $w'$ is recognized by $A_\chi$ through $\pi_l$ or $\pi_r$ by
examining the edges of $G(\pi)$. Since $w$ is recognized through $\pi$, we get by using
lemma 1:

- for all $0 \xrightarrow{I} i$ in $G(\pi)$,
  • if $i < 2K$ then $(0, 1) \subseteq I$;
  • if $i = 2K$ then $1 \in I$;
  • if $i > 2K$ then $(1, 2) \subseteq I$;

- for all $i \xrightarrow{I} j$ in $G(\pi)$, if $1 \le i < j \le 4K - 1$ then $(0, 1) \subseteq I$;

- for all $i \xrightarrow{I} 4K$ in $G(\pi)$,
  • if $i < 2K - 1$ then $(1, 2) \subseteq I$;
  • if $i = 2K - 1$ then $1 \in I$;
  • if $i > 2K - 1$ then $(0, 1) \subseteq I$.

All possible edges are considered. We distinguish two cases:

Case 1. Suppose there is no edge $i \to 4K$ with $i < 2K - 1$ in $G(\pi)$. In this
case, we consider the edges of $G(\pi_l)$. By the construction of $\pi_l$ from $\pi$,
particularly the fact that the transitions appearing in $\pi_l$ and $\pi$ are the same, we
get that (only the first case is detailed, the others are similar):

- for all $0 \xrightarrow{I} i$ in $G(\pi_l)$,






  • if $i < 2K + L$ then $(0, 1) \subseteq I$: simply note that the transition corresponding
    to node $i$ in $G(\pi_l)$ already appeared in $\pi$, and then it was attached to
    a node $< 2K$ (otherwise it would be shifted farther to the right), hence
    the constraint either disappears because of a clock reset in the cycle or
    stays the same as before the iteration of the cycle (that is, the same as
    for an edge $0 \to i$ with $i < 2K$ in $G(\pi)$);
  • if $i = 2K + L$ then $1 \in I$ (same constraint);
  • if $i > 2K + L$ then $(1, 2) \subseteq I$ (same constraint);

- for all $i \xrightarrow{I} j$ in $G(\pi_l)$, if $1 \le i < j \le 4K - 1 + L$ then $(0, 1) \subseteq I$ (the
constraint either disappears because of a clock reset in the cycle or stays the
same);

- for all $i \xrightarrow{I} 4K + L$ in $G(\pi_l)$,
  • $i < 2K - 1 + L$ cannot happen (by hypothesis);
  • if $i > 2K - 1 + L$ then $(0, 1) \subseteq I$ (same constraint).

By lemma 1, it is easy to see that $A_\chi$ recognizes $w'$ through $\pi_l$.

Case 2. Now suppose there is an edge $i \to 4K$ with $i < 2K - 1$ in $G(\pi)$: since
there is no crossing 0 i j 4K, there is no edge $0 \to j$ with $j > 2K - 1$. In
this second and last case, we consider the edges of $G(\pi_r)$ and we get that $A_\chi$
recognizes $w'$ through $\pi_r$. This ends the proof of Lemma 5.

Lemma 6. $w' \in [\![\psi_1]\!]$.

Proof. Again we proceed by induction.

- base case: all members $\chi$ of $C_l$ are such that $w' \in [\![\chi]\!]$ since we proved
Lemma 5;

- induction hypothesis (IH): all members $\chi'$ of $C_{i+1}$ satisfy $w' \in [\![\chi']\!]$;

- from $C_{i+1}$ to $C_i$: let $\chi \in C_i$. If $\chi \in C_{i+1}$, then we use the IH directly.
Otherwise, we distinguish four cases:

  • $\chi = \chi' \wedge \chi''$ with $\chi', \chi'' \in C_{i+1}$: here $w' \in [\![\chi']\!]$ and $w' \in [\![\chi'']\!]$ immediately
    imply $w' \in [\![\chi]\!]$;

  • $\chi = \chi' \cdot \chi''$ with $\chi', \chi'' \in C_{i+1}$: this case is impossible, as we noticed in
    the proof of lemma 3 and by construction of the sequence of
    cuts;

  • $\chi = (\chi')^+$ with $\chi' \in C_{i+1}$: here $w' \in [\![\chi']\!]$ immediately implies $w' \in [\![\chi]\!]$;

  • $\chi = \langle\chi'\rangle_I$ with $\chi' \in C_{i+1}$: we know $w \in [\![\chi]\!]$ and $\delta(w) \in (1, 2)$, and
    since $I$ is a timed interval this implies $(1, 2) \subseteq I$. Thus $\delta(w') \in I$ and
    $w' \in [\![\chi]\!]$.

This ends the proof of Lemma 6, and leads us to a contradiction since we have
$w' \notin \sigma([\![\varphi]\!])$ and therefore $[\![\psi]\!] \neq \sigma([\![\varphi]\!])$. Thus Proposition 2 is proven.




8 An Infinite Hierarchy of TREs 

In this section we introduce an infinite hierarchy of TREs based on the number
of $\wedge$-operators and the use or not of renaming. A TRE is $n$-$\wedge$ if it is constructed
with at most $n$ operators $\wedge$. Hence 0-$\wedge$ corresponds to $\wedge$-free. In the syntax tree
of a TRE, a $\wedge$-node is any node of the form $\varphi_1 \wedge \varphi_2$. An $n$-crossing for $n > 0$ is a
set of pairs of nodes $\{(i_k, j_k) \mid k = 1 \ldots n\}$ satisfying the following requirements:

- $\bigcap_{k=1}^{n+1} [i_k, j_k] \neq \emptyset$;

- for all $k, l \in [1, n+1]$, $k \neq l$, $[i_k, j_k] \not\supseteq [i_l, j_l]$.

One checks that a 2-crossing is simply a crossing ('2 edges overlapping'). Also
if a graph contains no $n$-crossing it won't contain any $(n+1)$-crossing either. We
can now state a generalization of lemma 2:

Lemma 7. Let $\varphi$ be an $n$-$\wedge$ TRE, and let $A_\varphi$ be its associated TA. Then no
graph of $A_\varphi$ contains any $(n+2)$-crossing.

Proof. We proceed by induction over $n$.

- base case ($n = 0$): this is exactly lemma 2;

- induction hypothesis (IH): we suppose that no graph of $A_\varphi$, where $\varphi$ is
any $n$-$\wedge$ TRE, contains any $(n+2)$-crossing;

- from $n$ to $n+1$: let $\varphi$ be an $(n+1)$-$\wedge$ TRE and consider its syntax tree.
Let $C$ be the unique cut consisting of nodes that are either leaves or $\wedge$-nodes
and such that none of their ancestors is a $\wedge$-node. Each of the $\wedge$-nodes
of $C$ has two sub-trees that have respectively $c_1$ and $c_2$ $\wedge$-nodes. We know
that $c_1 + c_2 \le n$, thus we can apply the IH to the TREs corresponding to the
two sub-trees of each $\wedge$-node of $C$. Now we can check that if $\varphi_1$ (resp. $\varphi_2$) is
such that no graph of $A_{\varphi_1}$ (resp. $A_{\varphi_2}$) contains any $(c_1+2)$-crossing (resp.
$(c_2+2)$-crossing), then no graph of $A_{\varphi_1 \wedge \varphi_2}$ (i.e. $\wedge$-node of $C$) contains
any $(c_1 + c_2 + 3)$-crossing, thus any $(n+3)$-crossing. Now $\varphi$ is obtained from
the nodes of $C$ without using the operator $\wedge$, since no ancestor of a node
of $C$ is a $\wedge$-node, and by adapting the proof of lemma 2 one can show
that we cannot get a new crossing this way. Thus $A_\varphi$ doesn't contain any
$(n+3)$-crossing and this ends the proof.



Let us define the sequence of TREs $(\varphi_n)_{n \ge 0}$ where

$$\varphi_n = \bigwedge_{i=0}^{n} a^{i} \cdot \langle a^{n+1}\rangle_1 \cdot a^{n-i}.$$

Clearly $\varphi_n$ is $n$-$\wedge$. We now give a generalization of Proposition 1, namely
that $\varphi_n$ is not $(n-1)$-$\wedge$, even if we allow renaming. Let $\sigma$ be any renaming:

Proposition 3. There is no $(n-1)$-$\wedge$ TRE $\psi$ such that $\sigma([\![\psi]\!]) = [\![\varphi_n]\!]$.






Proof. Let us proceed by contradiction: suppose there exists an $(n-1)$-$\wedge$ TRE $\psi$
and a renaming $\sigma$ satisfying $\sigma([\![\psi]\!]) = [\![\varphi_n]\!]$. Note that the alphabet of $\psi$ can
be any finite set of actions, but that $\sigma$ must map every action to $a$. We check
that $(a^{2n+1}, (\delta_i)_{1 \le i \le 2n+1})$ belongs to $[\![\varphi_n]\!]$ if $\delta_i = \frac{1}{n+1}$. Thus there exists an untimed
word $x = x_1 \cdots x_{2n+1}$ such that $w = (x, (\delta_i)_{1 \le i \le 2n+1}) \in [\![\psi]\!]$, and in the TA $A_\psi$
this timed word $w$ is recognized through a path $\pi$ of associated graph $G(\pi)$.
Since $\psi$ is $(n-1)$-$\wedge$ and by lemma 7, we know there is no $(n+1)$-crossing in
this graph. Hence there exists $0 \le k \le n$ such that the edge $k \to k + n + 1$
doesn't appear in $G(\pi)$: it is easy to check by using lemma 1 that the timed
word w' = (x, (Si)i<i< 2 n+i) with Si = Si for i k + n + 1 and i ^ k + n + 2, 
S'k+n+i = 2 (n+i) ^fe+n +2 = 2 (ra+i) rccognized by Atp through tt. Therefore 

$\sigma(w') = (a^{2n+1}, (\delta'_i)_{1 \le i \le 2n+1}) \in [\![\varphi_n]\!]$, which is a contradiction.

Finally here is an easy result to complete the picture: 

Proposition 4. Let $\varphi$ be a $\wedge$-free TRE and $\sigma$ a renaming. Then there exists a
$\wedge$-free TRE $\psi$ such that $[\![\psi]\!] = \sigma([\![\varphi]\!])$.

Proof. We denote the morphism associated with $\sigma$ by the same
symbol. One checks by an easy structural induction that the TRE $\psi$ is obtained
by replacing every atom $a$ appearing in $\varphi$ by $\sigma(a)$ (obviously this doesn't work
when the $\wedge$-operator is present).
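
The structural induction in this proof can be pictured as a recursive traversal of the syntax tree. The following Python sketch is only an illustration: the tuple encoding of TREs (`"atom"`, `"cat"`, `"or"`, `"plus"`, `"time"`) and the function name `rename` are assumptions made here, not notation from the paper.

```python
def rename(tre, sigma):
    """Apply the renaming `sigma` (a dict on atoms) to a conjunction-free TRE.

    Hypothetical encoding: ("atom", a), ("cat", l, r), ("or", l, r),
    ("plus", e) and ("time", e, interval).
    """
    kind = tre[0]
    if kind == "atom":
        return ("atom", sigma[tre[1]])
    if kind in ("cat", "or"):
        return (kind, rename(tre[1], sigma), rename(tre[2], sigma))
    if kind == "plus":
        return ("plus", rename(tre[1], sigma))
    if kind == "time":
        # the timing interval tre[2] is left untouched
        return ("time", rename(tre[1], sigma), tre[2])
    # a conjunction node is rejected: atom-wise replacement is not sound there
    raise ValueError("renaming by atom replacement needs a conjunction-free TRE")


# the sub-expression <a+ . b>_1 with a renaming sending both a and b to a
phi = ("time", ("cat", ("plus", ("atom", "a")), ("atom", "b")), (1, 1))
psi = rename(phi, {"a": "a", "b": "a"})
```

Rejecting conjunction nodes mirrors the remark that the replacement does not work when the conjunction operator is present.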

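Let $\mathcal{T}_n = \{[\![\varphi]\!] \mid \varphi \text{ is } n\text{-}\wedge\}$ and $\mathcal{T}'_n = \{\sigma([\![\varphi]\!]) \mid \varphi \text{ is } n\text{-}\wedge \text{ and } \sigma \text{ is a renaming}\}$.
Clearly $\mathcal{T}_n \subseteq \mathcal{T}_{n+1}$, $\mathcal{T}'_n \subseteq \mathcal{T}'_{n+1}$ and $\mathcal{T}_n \subseteq \mathcal{T}'_n$ for $n \ge 0$. If we sum up the results we
obtained, we get the following infinite hierarchy between these sets: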



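$$\mathcal{T}'_0 \subsetneq \mathcal{T}'_1 \subsetneq \cdots \subsetneq \mathcal{T}'_n \subsetneq \mathcal{T}'_{n+1} \subsetneq \cdots$$

$$\mathcal{T}_0 \subsetneq \mathcal{T}_1 \subsetneq \cdots \subsetneq \mathcal{T}_n \subsetneq \mathcal{T}_{n+1} \subsetneq \cdots$$

with $\mathcal{T}_0 = \mathcal{T}'_0$ and $\mathcal{T}_n \subsetneq \mathcal{T}'_n$ for $n \ge 1$.

Moreover $\mathcal{T}_i$ and $\mathcal{T}'_j$ are incomparable for $0 < j < i$ (no inclusion holds).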

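Proof. The equality between $\mathcal{T}_0$ and $\mathcal{T}'_0$ has just been proven in Proposition 4.
Now for $n \ge 0$, Proposition 3 means that $\mathcal{T}_{n+1} \not\subseteq \mathcal{T}'_n$ (therefore $\mathcal{T}_{n+1} \not\subseteq \mathcal{T}_n$ and
$\mathcal{T}'_{n+1} \not\subseteq \mathcal{T}'_n$, while $\mathcal{T}_i \not\subseteq \mathcal{T}'_j$ for $0 < j < i$), while Proposition 2 tells us $\mathcal{T}'_1 \not\subseteq \mathcal{T}_n$
(which implies $\mathcal{T}'_{n+1} \not\subseteq \mathcal{T}_{n+1}$ and $\mathcal{T}'_j \not\subseteq \mathcal{T}_i$ for $0 < j < i$).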



9 Discussion 



We already mentioned the fact that in the original paper [3], the authors used 
a signal-based semantics for TREs, associating with any TRE what is actually 
a TA with silent transitions (while we use timed words and plain TAs). Thus 
one may wonder whether or not our results still hold if we allow silent actions $\varepsilon$
as atoms in our definition of TREs. We believe the core of the proofs of this
paper would remain the same if we used those extended TREs, although quite a 
few technical difficulties would be added (the graphs now have ‘invisible nodes’ 
scattered among the nodes we considered). 






Acknowledgements 

Many thanks to Paul Gastin for his remarks that improved the paper a lot. The 
author also wishes to thank Eugene Asarin and the anonymous referees for their 
valuable comments. 

References 

1. R. Alur and D.L. Dill: Automata for Modelling Real-Time Systems. Proceedings of
ICALP'90, LNCS 443, pages 322-335, 1990.

2. R. Alur and D.L. Dill: A Theory of Timed Automata. Theoretical Computer Science
126, pages 183-235, 1994.

3. E. Asarin, P. Caspi and O. Maler: A Kleene Theorem for Timed Automata.
Proceedings of LICS'97, pages 160-170, 1997.

4. B. Berard, V. Diekert, P. Gastin and A. Petit: Characterization of the Expressive
Power of Silent Transitions in Timed Automata. Fundamenta Informaticae 36,
pages 145-182, 1998.

5. V. Diekert, P. Gastin and A. Petit: Removing ε-Transitions in Timed Automata.
Proceedings of STACS'97, LNCS 1200, pages 583-594, 1997.

6. P. Herrmann: Timed Automata and Recognizability. Information Processing Letters
65, pages 313-318, 1998.

7. S.C. Kleene: Representations of Events in Nerve Nets and Finite Automata.
Automata Studies, pages 3-42, 1956.



Product Interval Automata: A Subclass of 
Timed Automata 



Deepak D’Souza* and P. S. Thiagarajan** 

Chennai Mathematical Institute, 

92 G. N. Chetty Road, Chennai 600 017, India 



Abstract. We identify a subclass of timed automata and develop its 
theory. These automata, called product interval automata, consist of a 
network of timed agents. The key restriction is that there is just one clock 
for each agent and the way the clocks are read and reset is determined 
by the distribution of shared actions across the agents. We show that the 
resulting automata admit a clean theory in both logical and language- 
theoretic terms. It turns out that the study of these timed automata can 
exploit the rich theory of partial orders known as Mazurkiewicz traces. 
An important consequence is that the partial order reduction techniques 
being developed for timed automata [4,10] can be readily applied to the 
verification tasks associated with our automata. Indeed we expect this 
to be the case even for the extension of product interval automata called 
distributed interval automata. 



1 Introduction 

Timed automata as formulated by Alur and Dill [1] have become a standard 
model for describing timed behaviours. These automata are very powerful in 
language-theoretic terms. Their languages are not closed under complementa- 
tion. Further, their language inclusion problem is undecidable and hence cannot 
be reduced to the emptiness problem which is decidable. Hence in order to solve 
verification problems posed as language inclusion problems, one must use deter- 
ministic timed automata for specifications (which can be easily complemented) 
or one must work with a restricted class of timed automata that possess the 
desired closure properties. 

Here we follow the second route and propose a subclass of timed automata 
called product interval automata (PI automata). Roughly speaking, such an 
automaton will consist of a network of timed agents $\|_{i=1}^{K} A_i$ where each $A_i$ will
operate over an alphabet $\Sigma_i$ of events. Further, there will be a single clock $c_i$
associated with each agent $i$. The agents communicate by synchronising on the
timed executions of common events. Suppose $a$ is an event in which the agents
{1, 3, 4} participate. Then the timing constraint governing each $a$-execution will
* A part of this work has been supported by BRICS, Computer Science Dept., Aarhus 
University. 

** A part of this work has been supported by the IFCPAR Project 1502-1. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 60-71, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






only involve the clocks $\{c_1, c_3, c_4\}$. Moreover, the set of clocks that is reset at the
end of each $a$-execution will be $\{c_1, c_3, c_4\}$. Thus the distribution $\tilde{\Sigma} = \{\Sigma_i\}_{i=1}^{K}$
of events over the agents will canonically determine the usage of clocks; so much
so, we can avoid mentioning the clocks altogether once we fix $\tilde{\Sigma}$.

This method of structuring timed automata has a number of advantages. In 
particular, one can provide naturally decomposed and succinct presentations of 
timed automata with large (control) state spaces. Admittedly, the technique of 
presenting a global timed automaton as a product of component timed automata 
has been used by many authors starting from [1]. What is new here, as explained 
above, is that our decomposed presentation places a corresponding restriction 
on the manner in which clocks are read and reset. It is worth pointing out that 
the model considered by Yi and Jonsson [24] in the framework of timed CSP 
can be easily represented as PI automata. Their main result, in our terms, is
that the language inclusion problem for PI automata is decidable. We establish a
variety of results concerning PI automata which subsume the decidability of the 
language inclusion problem. 

In principle, one could view our automata as a restricted kind of labelled 
timed Petri net. However the semantics we attach to our automata is the stan- 
dard one for timed automata whereas the semantics one uses for timed Petri 
nets - with earliest and latest firing times for the transitions - is somewhat 
different. 

A key feature of our automata is that their timed behaviour can be symbol- 
ically represented without any loss of information by conventional words. As a 
consequence, it turns out that their theory can be developed with the help of 
powerful results available in the theory of Mazurkiewicz traces [6]. We wish to
emphasise that our automata will nevertheless have a conventional timed semantics with
the non-negative reals serving as the time frame. A final aspect of PI automata
that we wish to mention is that partial order reduction techniques that are under 
development [4,10] can be readily applied to our automata. 

In pragmatic terms, it is not clear how much modelling power is lost through 
the restrictions we place on the usage of clocks. In many multi-agent timed 
systems it seems sufficient to have just one clock for each agent. For instance, a 
network of timed automata that communicate through shared variables is used 
to model and analyse the timed behaviour of asynchronous circuits by Maler 
and Pnueli [17]. It turns out that product interval automata suffice to represent 
the same class of timed behaviours. 

From a theoretical standpoint, PI automata are strictly less expressive than 
event clock automata due to Alur, Fix, and Henzinger [2] and their state-based 
version [19] which in turn are strictly less powerful than general timed automata. 
As a result, the logics we develop here will also be strictly less expressive than 
the corresponding logics presented in [14] for a generalisation of event recording 
automata called recursive event recording automata. Nevertheless we feel that PI 
automata are of independent interest due to the reasons sketched earlier. For ba- 
sic information about timed automata and their logics we cite the surveys [3,13] 
and their references. An interesting early instance of timed languages which have 






nice closure properties and which admit a clean logical characterisation can be 
found in [22]. 

In the next section we define PI automata. In section 3 we show that these 
automata (more precisely their languages) are closed under boolean operations. 
In section 4 we first present a monadic second order logic denoted TMSO$^\otimes$
to capture the timed regular languages recognised by PI automata. We then
formulate a linear time temporal logic denoted TLTL$^\otimes$ and sketch
automata-theoretic solutions to the satisfiability and model checking problems for TLTL$^\otimes$
in terms of PI automata.

As we point out in the final section, all our ideas can be extended smoothly to 
a larger setting in which the underlying “symbolic” automata are asynchronous 
Büchi automata [11]. The resulting timed automata are called distributed inter-
val automata and they can also be studied using techniques taken from trace 
theory. It is the case that PI automata are less expressive than distributed inter- 
val automata which in turn are less expressive than event recording automata [2]. 
It turns out that the so-called cellular version of distributed interval automata
corresponds to event recording automata. All these extensions as well as detailed
proofs of all the results presented below are available in the full paper [7].

2 Product Interval Automata 

We fix a finite set of agents $\mathcal{P} = \{1, 2, \ldots, K\}$ and let $i, j$ range over $\mathcal{P}$.
A $\mathcal{P}$-distributed alphabet is a family $\tilde{\Sigma} = \{\Sigma_i\}_{i \in \mathcal{P}}$ where each $\Sigma_i$ is a finite
set of actions. We set $\Sigma = \bigcup_{i \in \mathcal{P}} \Sigma_i$ and call it the global alphabet induced
by $\tilde{\Sigma}$. We let $a, b$ range over $\Sigma$. The set of agents that participate in each
occurrence of the action $a$ is denoted by $loc(a)$ and is given by: $loc(a) = \{i \mid a \in \Sigma_i\}$.
Through the rest of the paper we fix such a $\mathcal{P}$-distributed alphabet $\tilde{\Sigma}$.
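
As a small illustration of these definitions, the sketch below computes the global alphabet and $loc(a)$ for a distributed alphabet stored as a Python dictionary from agents to sets of actions. The dictionary representation and the sample alphabet are assumptions chosen here for illustration only.

```python
def global_alphabet(alphabet):
    """Union of all component alphabets."""
    return set().union(*alphabet.values())


def loc(action, alphabet):
    """Agents whose component alphabet contains `action`."""
    return {i for i, sigma_i in alphabet.items() if action in sigma_i}


# hypothetical distributed alphabet over agents 1..4: `a` is shared by 1, 3, 4
example = {1: {"a"}, 2: {"b"}, 3: {"a", "b"}, 4: {"a"}}
participants_of_a = loc("a", example)
```

With this sample alphabet, the action `a` involves exactly the agents 1, 3 and 4, matching the situation described in the introduction.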

We let $\mathbb{R}^{>0}$ and $\mathbb{R}^{\ge 0}$ denote the set of positive and non-negative reals
respectively. Without loss of generality we will use intervals with rational bounds to
specify timing constraints (and use $\infty$ as the upper bound to capture unbounded
intervals). If an interval is of the form $[x, y)$ or $[x, y]$ we require $x$ to be a positive
rational and if an interval is of the form $(x, y]$ or $[x, y]$ we require $y$ to be a
positive rational. An interval defines a subset of reals in the obvious way. Let
INT be the set of all such intervals.

A product interval automaton over $\tilde{\Sigma}$ is a structure $(\{A_i\}_{i \in \mathcal{P}}, Q_{in})$, where
for each $i$, $A_i$ is a structure $(Q_i, \to_i, F_i, G_i)$ such that $Q_i$ is a finite set of
states, $\to_i$, the transition relation, is a finite subset of $Q_i \times (\Sigma_i \times INT) \times Q_i$,
and $F_i, G_i \subseteq Q_i$ are, respectively, finitary and infinitary acceptance state sets.
$Q_{in} \subseteq Q = Q_1 \times \cdots \times Q_K$ is a set of global start states.

For convenience we will be interested only in infinite runs of a PI automaton. 
However in such a run some of the component automata may execute only a 
finite number of actions. It is for this reason we have two types of accepting 
states for each component automaton. 

In what follows, for an alphabet $A$ we will use $A^*$ and $A^\omega$ to denote the set of
finite and infinite words over $A$ respectively, and let $A^\infty$ denote the set $A^* \cup A^\omega$.






For a word $\sigma$ in $A^\infty$, we let $prf(\sigma)$ be the set of finite prefixes of $\sigma$. A timed
word $\sigma$ over $\tilde{\Sigma}$ is an element of $(\Sigma \times \mathbb{R}^{>0})^\infty$ such that:

(i) $\sigma$ is non-decreasing: if $\tau(a, t)(b, t')$ is a prefix of $\sigma$ then $t \le t'$.

(ii) if $\tau(a, t)\tau'(b, t')$ is a prefix of $\sigma$ with $t = t'$, then we must have
$loc(a) \cap loc(b) = \emptyset$. Thus, simultaneous actions are allowed but only if they are
independent, i.e. their locations are disjoint.

(iii) if $\sigma$ is infinite, then it must be progressive: for each $t \in \mathbb{R}^{>0}$ there exists a
prefix $\tau(a, t')$ of $\sigma$ such that $t' > t$.

We let $T\tilde{\Sigma}^\omega$ denote the set of infinite timed words over $\tilde{\Sigma}$.
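
Conditions (i) and (ii) can be checked mechanically on a finite word. The sketch below is a hedged illustration, assuming a finite timed word is given as a list of `(action, time)` pairs and the distributed alphabet as a dictionary from agents to sets of actions; the function names are hypothetical, and condition (iii) is not covered since it only concerns infinite words.

```python
from itertools import combinations


def loc(action, alphabet):
    """Agents whose component alphabet contains `action`."""
    return {i for i, sigma_i in alphabet.items() if action in sigma_i}


def is_finite_timed_word(word, alphabet):
    """Check conditions (i) and (ii) on a finite list of (action, time) pairs."""
    # (i) time stamps never decrease along the word
    for (_, t), (_, t_next) in zip(word, word[1:]):
        if t > t_next:
            return False
    # (ii) two letters with the same time stamp must have disjoint locations
    for (a, t), (b, t_other) in combinations(word, 2):
        if t == t_other and loc(a, alphabet) & loc(b, alphabet):
            return False
    return True


# hypothetical alphabet: a involves agents 1 and 3, b agent 2, c agent 3
alphabet = {1: {"a"}, 2: {"b"}, 3: {"a", "c"}}
ok = is_finite_timed_word([("a", 1.0), ("b", 1.0), ("c", 2.5)], alphabet)
clash = is_finite_timed_word([("a", 1.0), ("c", 1.0)], alphabet)
```

In the second word, `a` and `c` share agent 3 and occur at the same time, so condition (ii) is violated.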

Now suppose $K = 1$ so that $\Sigma = \Sigma_1$. Then the definition above (condition (ii)
will not apply) will yield the usual notion of a timed word over the alphabet $\Sigma_1$.
With this in mind, for an alphabet $A$, we let $TA^*$ and $TA^\omega$ be the set of finite and
infinite (ordinary) timed words over $A$, respectively, and set $TA^\infty = TA^* \cup TA^\omega$.

Let $\sigma \in T\tilde{\Sigma}^\omega$. Then $\sigma \upharpoonright i$ is the $i$-projection of $\sigma$. It is the timed word over $\Sigma_i$
obtained by erasing from $\sigma$ all appearances of letters of the form $(a, t)$ with
$a \notin \Sigma_i$. It is easy to check that $\sigma \upharpoonright i$ belongs to $T\Sigma_i^\infty$.

Finally, let $\tau$ be a finite timed word over $\tilde{\Sigma}$. Then $time_i(\tau)$ is given inductively
by: $time_i(\varepsilon) = 0$. Further, $time_i(\tau'(a, t)) = t$ if $a \in \Sigma_i$, and equals $time_i(\tau')$
otherwise. Thus $time_i(\tau)$ is the time at which the last $i$-action took place during
the execution of $\tau$. This notion will play an important role in the rest of the
paper. We can now define the timed language accepted by a PI automaton.
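
The $i$-projection and $time_i$ are easy to compute on finite timed words. The following sketch assumes, for illustration only, that a finite timed word is a list of `(action, time)` pairs and that the component alphabet of agent $i$ is passed directly as a set; the names `project` and `time_i` are chosen here.

```python
def project(word, sigma_i):
    """i-projection: keep only the letters whose action belongs to sigma_i."""
    return [(a, t) for a, t in word if a in sigma_i]


def time_i(word, sigma_i):
    """Time of the last action of agent i in `word`, or 0 if there is none."""
    last = 0
    for a, t in word:
        if a in sigma_i:
            last = t
    return last


# hypothetical example: agent 1 owns {a, c}, agent 2 owns {b}
word = [("a", 1.0), ("b", 1.5), ("a", 2.0), ("c", 3.0)]
sigma_1, sigma_2 = {"a", "c"}, {"b"}
projection_2 = project(word, sigma_2)
```

The loop in `time_i` is the iterative form of the inductive definition: each letter of agent $i$ overwrites the previously recorded time, and other letters leave it unchanged.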

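Let $A = (\{A_i\}_{i \in \mathcal{P}}, Q_{in})$ be a PI automaton over $\tilde{\Sigma}$ and let $\sigma \in T\tilde{\Sigma}^\omega$. Then
a run of $A$ over $\sigma$ is a map $\rho : prf(\sigma) \to Q$ such that

1. $\rho(\varepsilon) \in Q_{in}$.

2. Suppose $\tau(a, t)$ is a prefix of $\sigma$. Then for each $i \in loc(a)$, there exists a
transition $\rho(\tau)[i] \xrightarrow{(a, I)}_i \rho(\tau(a, t))[i]$ with $(t - time_i(\tau)) \in I$. Further, for
each $i \notin loc(a)$ we have $\rho(\tau)[i] = \rho(\tau(a, t))[i]$.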
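
Condition 2 describes one step of a run. The sketch below enumerates the global states reachable in one such step; it is an illustration under several assumptions made here: intervals are tuples `(lo, hi, lo_closed, hi_closed)`, each component's transitions form a set of `(state, action, interval, state)` tuples, a global state is a dictionary from agents to local states, and `last` records $time_i$ of the prefix read so far for each agent.

```python
import math
from itertools import product


def contains(interval, x):
    """Membership of x in an interval (lo, hi, lo_closed, hi_closed)."""
    lo, hi, lo_closed, hi_closed = interval
    above = x >= lo if lo_closed else x > lo
    below = x <= hi if hi_closed else x < hi
    return above and below


def successors(state, last, letter, alphabet, transitions):
    """Global states reachable from `state` by reading `letter` = (a, t)."""
    a, t = letter
    agents = sorted(i for i in alphabet if a in alphabet[i])  # loc(a)
    options = []
    for i in agents:
        delay = t - last[i]  # time elapsed since the last action of agent i
        options.append([q2 for (q1, b, interval, q2) in transitions[i]
                        if q1 == state[i] and b == a and contains(interval, delay)])
    result = []
    for choice in product(*options):
        nxt = dict(state)  # agents outside loc(a) keep their local state
        nxt.update(zip(agents, choice))
        result.append(nxt)
    return result


# hypothetical two-agent automaton sharing the action a
alphabet = {1: {"a"}, 2: {"a", "b"}}
transitions = {
    1: {("p", "a", (0, 1, False, True), "p2")},
    2: {("q", "a", (0, 2, False, False), "q2"),
        ("q", "b", (0, 1, False, False), "q")},
}
start = {1: "p", 2: "q"}
clocks = {1: 0, 2: 0}
after_a = successors(start, clocks, ("a", 1.0), alphabet, transitions)
```

A caller would then update `last[i]` to `t` for every agent in $loc(a)$ before reading the next letter.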

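A run $\rho$ of $A$ on $\sigma$ is accepting iff for each $i \in \mathcal{P}$:

(i) If $\sigma \upharpoonright i$ is finite then $\rho(\tau)[i] \in F_i$ where $\tau$ is a prefix of $\sigma$ such that $\tau \upharpoonright i = \sigma \upharpoonright i$.

(ii) If $\sigma \upharpoonright i$ is infinite then $\rho(\tau)[i] \in G_i$ for infinitely many $\tau \in prf(\sigma)$.

We set $L(A)$ to be the set of words in $T\tilde{\Sigma}^\omega$ accepted by $A$ (i.e. those on
which $A$ has an accepting run).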

We can solve the emptiness problem for PI automata by reducing it to the
emptiness problem for timed Büchi automata (TBA) as formulated by Alur
and Dill [1]. The concerned timed simulation is the obvious one and the details 
can be found in [7]. In going from a PI automaton to a timed automaton, there 
will be a blow-up in the size of the automaton. 

3 Closure under Boolean Operations 

Here we wish to show that the class of languages accepted by PI automata 
(over E) is closed under boolean operations. Using the notion of proper interval 






sets, we will transform the relevant problems to the domain of ordinary trace 
languages. We can then apply known constructions involving trace languages to 
obtain the desired results. 

It will be useful to first study a single component of a PI automaton. To this
end, let $A$ be a finite alphabet of actions. Then an interval alphabet based on $A$
is a finite subset of $A \times INT$. Let $\Gamma$ be an interval alphabet based on $A$. Then
an interval automaton over $\Gamma$ is a structure $B = (Q, \to, Q_{in}, F, G)$ where $Q$ is
a finite set of states, $\to \subseteq Q \times \Gamma \times Q$ is a transition relation, $Q_{in} \subseteq Q$ is a
set of initial states, $F \subseteq Q$ is a set of finitary accepting states and $G \subseteq Q$ is
a set of infinitary accepting states. $L(B) \subseteq TA^\infty$, the language of timed words
accepted by $B$, is now defined in the obvious way. (To be pedantic, one can
set $\mathcal{P} = \{1\}$ and $\Sigma_1 = A$ and apply the definitions from the previous section.)
There is also a symbolic language $L_{sym}(B) \subseteq \Gamma^\infty$ that one can associate with $B$
in the classical manner. The finite words in $L_{sym}(B)$ are accepted using members
of $F$ and the infinite words are accepted by viewing $G$ as a Büchi condition
(see [21] for a standard treatment). It is also the case that there is a natural
map $tw : \Gamma^\infty \to 2^{TA^\infty}$ such that $tw(L_{sym}(B)) = L(B)$. The map $tw$ can be
defined as follows:

Let $\hat{\sigma} \in \Gamma^\infty$ and $\sigma \in TA^\infty$. Then $\sigma \in tw(\hat{\sigma})$ iff $|\sigma| = |\hat{\sigma}|$ and for each
prefix $\tau(a, t)$ of $\sigma$ and each prefix $\hat{\tau}(b, I)$ of $\hat{\sigma}$ with $|\tau| = |\hat{\tau}|$ we have $a = b$ and
$t - time(\tau) \in I$. The function $time$ is of course given inductively by $time(\varepsilon) = 0$
and $time(\tau'(a', t')) = t'$. As usual, for $L \subseteq \Gamma^\infty$ we set $tw(L) = \bigcup_{\hat{\sigma} \in L} tw(\hat{\sigma})$. To
sum up, the simple observation we wish to start with is:

Proposition 1. Let $B$ be an interval automaton over the interval alphabet $\Gamma$
(based on $A$). Then $L(B) = tw(L_{sym}(B))$. □
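
For finite words, membership in $tw$ of a symbolic word amounts to a letter-by-letter comparison of actions and delays. The sketch below is a hedged illustration, assuming timed words are lists of `(action, time)` pairs, symbolic words are lists of `(action, interval)` pairs, and intervals are tuples `(lo, hi, lo_closed, hi_closed)`; none of these encodings come from the paper.

```python
import math


def contains(interval, x):
    """Membership of x in an interval (lo, hi, lo_closed, hi_closed)."""
    lo, hi, lo_closed, hi_closed = interval
    above = x >= lo if lo_closed else x > lo
    below = x <= hi if hi_closed else x < hi
    return above and below


def in_tw(timed, symbolic):
    """Is the finite timed word in tw of the finite symbolic word?"""
    if len(timed) != len(symbolic):
        return False
    previous = 0  # time of the empty prefix
    for (a, t), (b, interval) in zip(timed, symbolic):
        # same action, and the delay since the previous letter lies in the interval
        if a != b or not contains(interval, t - previous):
            return False
        previous = t
    return True


# hypothetical symbolic word: a within (0, 1], then b after a delay of at least 1
symbolic = [("a", (0, 1, False, True)), ("b", (1, math.inf, True, False))]
accepted = in_tw([("a", 0.5), ("b", 2.0)], symbolic)
rejected = in_tw([("a", 0.5), ("b", 1.2)], symbolic)
```

In the rejected word the delay between `a` and `b` is 0.7, which falls outside the second interval.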

Now we turn to proper interval sets. We say that $X \subseteq INT$ is proper iff $X$ is a
finite partition of $(0, \infty)$. An interval alphabet $\Gamma$ based on $A$ is said to be proper
iff for each $a$ in $A$, the set of intervals $\Gamma_a$ is proper where $\Gamma_a = \{I \mid (a, I) \in \Gamma\}$.
The next observation, which is easy to verify, is crucial for our purposes.

Proposition 2. Let $\Gamma$ be a proper interval alphabet based on $A$ and let $\sigma \in
TA^\infty$. Then there exists a unique word $\hat{\sigma} \in \Gamma^\infty$ such that $\sigma \in tw(\hat{\sigma})$. □

The uniqueness of $\hat{\sigma}$ as stated in the above result is the key to reducing the
study of timed languages to the study of trace languages in the present setting.
This can be achieved by embedding interval alphabets into proper interval
alphabets. To formalise this idea, let $X, X' \subseteq INT$. Then $X'$ is said to cover $X$
iff each interval in $X$ can be expressed as the union of a set of intervals taken
from $X'$. Clearly the choice of such a set of intervals (for a given member of $X$)
will be unique in case $X'$ is a proper interval set.

Let $\Gamma$ and $\Gamma'$ be two interval alphabets based on the alphabet $A$. Then $\Gamma'$
covers $\Gamma$ iff $\Gamma'_a$ covers $\Gamma_a$ for each $a \in A$.

We will say that $L \subseteq TA^\infty$ is a regular interval language over $A$ iff there
exists an interval alphabet $\Gamma$ based on $A$ and an interval automaton $B$ over $\Gamma$
such that $L = L(B)$. For convenience we will often say "interval language" to
mean "regular interval language."






Proposition 3. Let $A$ be a finite alphabet of actions.

(i) Suppose $\Gamma$ is an interval alphabet based on $A$. Then one can effectively
construct a proper interval alphabet $\Gamma'$ based on $A$ such that $\Gamma'$ covers $\Gamma$.

(ii) The class of interval languages over $A$ is closed under boolean operations.

Proof sketch. (i) follows from the observation that given an interval set $X =
\{I_1, \ldots, I_m\}$ we can form the set consisting of 0, $\infty$ and the rational
numbers that appear as bounds of the intervals in $X$. Let this set be $\{0, x_1, \ldots, x_n, \infty\}$
with $0 < x_1 < x_2 < \cdots < x_n < \infty$. Then $X' = \{(0, x_1), [x_1, x_1], (x_1, x_2),
[x_2, x_2], \ldots, (x_n, \infty)\}$ is a proper interval set which covers $X$.

To prove (ii), one first observes that closure under union is obvious. So let $B$
be an interval automaton over the interval alphabet $\Gamma$. Then we can construct
$\Gamma'$ such that $\Gamma'$ is proper and covers $\Gamma$. Let $B = (Q, \to, Q_{in}, F, G)$. Then we
construct the interval automaton $B' = (Q, \Rightarrow, Q_{in}, F, G)$ over $\Gamma'$ where $\Rightarrow \subseteq
Q \times \Gamma' \times Q$ is given by: $q \xRightarrow{(a, I')} q'$ iff there exists $q \xrightarrow{(a, I)} q'$ in $B$ such that $I' \subseteq I$. It is
easy to verify that $L(B) = L(B')$. Using classical techniques we can now find an
interval automaton $B''$ over $\Gamma'$ such that $L_{sym}(B'') = (\Gamma')^\infty - L_{sym}(B')$. Since $\Gamma'$
is proper we can now use proposition 2 to conclude that $L(B'') = TA^\infty - L(B')$.
□
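
The construction in part (i) can be sketched as follows. This is an illustration only: intervals are encoded here as tuples `(lo, hi, lo_closed, hi_closed)` with `math.inf` standing for the unbounded upper end, and the function name `proper_cover` is hypothetical.

```python
import math


def contains(interval, x):
    """Membership of x in an interval (lo, hi, lo_closed, hi_closed)."""
    lo, hi, lo_closed, hi_closed = interval
    above = x >= lo if lo_closed else x > lo
    below = x <= hi if hi_closed else x < hi
    return above and below


def proper_cover(intervals):
    """Split (0, inf) at every finite positive bound occurring in `intervals`."""
    bounds = sorted({b for lo, hi, _, _ in intervals
                     for b in (lo, hi) if 0 < b < math.inf})
    pieces = []
    left = 0
    for x in bounds:
        pieces.append((left, x, False, False))  # open interval (left, x)
        pieces.append((x, x, True, True))       # point interval [x, x]
        left = x
    pieces.append((left, math.inf, False, False))
    return pieces


# hypothetical interval set: (0, 2] and [1, inf)
pieces = proper_cover([(0, 2, False, True), (1, math.inf, True, False)])
```

Each original interval is then the union of the pieces it contains; for instance $(0, 2]$ is recovered from the first four pieces of the result above.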



The technique used in the second part of the proof is deployed in different 
contexts to establish many of our results. It is for this reason we have presented 
some of the key details in a very simple setting. 

We can now address the main concerns of this section. To show closure
under boolean operations of the languages accepted by PI automata, we will
first characterise these languages using interval languages. Let $\{L_i\}$ be a
family of languages such that $L_i \subseteq T\Sigma_i^\infty$ for each $i$. (Recall $\tilde{\Sigma} = \{\Sigma_i\}_{i \in \mathcal{P}}$.)
Then $\otimes(L_1, \ldots, L_K)$ denotes the language $L \subseteq T\tilde{\Sigma}^\omega$ defined as:

$\sigma \in L$ iff $\sigma \upharpoonright i \in L_i$ for each $i$.

We will say that $L \subseteq T\tilde{\Sigma}^\omega$ is a regular direct product interval language iff
there exists a family $\{L_i\}$ such that each $L_i$ is a regular interval language over $\Sigma_i$
and $L = \otimes(L_1, \ldots, L_K)$. A regular product interval language over $\tilde{\Sigma}$ is a finite
union of regular direct product interval languages over $\tilde{\Sigma}$. Again, for convenience,
we will often say "product interval language" to mean "regular product interval
language" etc.

Theorem 1. Let $L \subseteq T\tilde{\Sigma}^\omega$. Then $L$ is a product interval language iff there
exists a PI automaton $B$ over $\tilde{\Sigma}$ such that $L = L(B)$. □

Using the theory of product languages [15], which is a natural subclass of 
Mazurkiewicz trace languages and the technique used in the proof of Proposi- 
tion 3 we can now show: 

Theorem 2. The class of product interval languages over $\tilde{\Sigma}$ is closed under
boolean operations.






Deepak D’Souza and P. S. Thiagarajan 



4 A Logical Characterisation of Product Interval 
Languages 

Our goal here is to first develop a logical characterisation of product interval 
languages in terms of a monadic second order logic. We shall then present a 
related linear time temporal logic for reasoning about timed behaviours captured 
by PI automata. As done in the previous section it will be convenient to first 
concentrate on the one component case. 

Let $\Sigma$ be a finite alphabet of actions. Here and in the logics to follow, we
assume a supply of individual variables $x, y, \ldots$ and set variables $X, Y, \ldots$.
For each $a \in \Sigma$ we have a unary predicate $Q_a$. There are also predicates of
the form $\Delta(x, I)$ in which $x$ is an individual variable, and $I$ is a member of
$INT$.

The syntax of the logic TMSO($\Sigma$) is given by:

    TMSO($\Sigma$) ::= $x \in X \mid x < y \mid Q_a(x) \mid \Delta(x, I) \mid \varphi \vee \varphi' \mid \sim\varphi \mid (\exists x)\varphi \mid (\exists X)\varphi$.

A structure for this logic is a pair $(\sigma, \mathbb{I})$ where $\sigma \in T\Sigma^\infty$
and $\mathbb{I}$ is an interpretation which assigns to each individual and set variable
an element and subset, respectively, of the set $pos(\sigma)$. We define $pos(\sigma)$ to
be the set $\{1, 2, \ldots, |\sigma|\}$ in case $\sigma$ is finite, and $\{1, 2, \ldots\}$ if
$\sigma$ is infinite. As is customary, $\sigma$ can also be viewed as a map from
$pos(\sigma)$ into $\Sigma$ with $\sigma(n)$ for $n \in pos(\sigma)$ having the obvious
meaning. We will use this view of $\sigma$ whenever convenient. Turning now to the
semantics, let $(\sigma, \mathbb{I})$ be a structure. Then
$\sigma \models_{\mathbb{I}} (x < y)$ iff $\mathbb{I}(x) <' \mathbb{I}(y)$. (Here $<'$ is the
usual ordering over the integers.) Further, $\sigma \models_{\mathbb{I}} Q_a(x)$ iff
$\sigma(\mathbb{I}(x)) = (a, t)$ for some $t$. As for the predicate $\Delta(x, I)$, we have

    $\sigma \models_{\mathbb{I}} \Delta(x, I)$ iff $t - time(\tau) \in I$, where $\tau(a, t)$ is
    the prefix of $\sigma$ s.t. $|\tau(a, t)| = \mathbb{I}(x)$.

The semantics of the logical connectives $\sim$ and $\vee$, and the existential quantifiers
$\exists x$ and $\exists X$, are given in the expected manner. For a sentence
$\varphi \in$ TMSO($\Sigma$) we set
$L(\varphi) = \{\sigma \in T\Sigma^\infty \mid \sigma \models \varphi\}$.

We can now use the classical logical characterisation of $\omega$-regular languages
and the notion of proper interval alphabets to establish:

Theorem 3. Let $L \subseteq T\Sigma^\infty$. Then $L$ is an interval language iff there
exists a sentence $\varphi \in$ TMSO($\Sigma$) such that $L = L(\varphi)$.

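Next we wish to present the logic TMSO$^\otimes(\tilde{\Sigma})$ which will characterise
product interval languages. The syntax of the logic is given by:

    TMSO$^\otimes(\tilde{\Sigma})$ ::= $\varphi(i) \mid \sim\alpha \mid \alpha \vee \beta \mid \alpha \wedge \beta$

where for each formula $\varphi(i)$ we require $\varphi \in$ TMSO($\Sigma_i$).

A structure for the logic is a pair $(\sigma, \{\mathbb{I}_i\}_{i \in \mathcal{P}})$, where
$\sigma \in T\Sigma^\infty$ and for each $i \in \mathcal{P}$, $\mathbb{I}_i$ is an
interpretation for individual and set variables over the set of positions
$pos(\sigma \upharpoonright i)$. As for the semantics,

    $\sigma \models_{\{\mathbb{I}_i\}_{i \in \mathcal{P}}} \varphi(i)$ iff $\sigma \upharpoonright i \models_{\mathbb{I}_i} \varphi$.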




Product Interval Automata: A Subclass of Timed Automata 






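The boolean operators $\vee$ and $\wedge$ are interpreted in the usual manner. Once
again, for a sentence $\varphi \in$ TMSO$^\otimes(\tilde{\Sigma})$, we set
$L(\varphi) = \{\sigma \in T\Sigma^\infty \mid \sigma \models \varphi\}$.

Theorem 4. Let $L \subseteq T\Sigma^\infty$. Then $L$ is a product interval language iff
$L = L(\varphi)$ for some sentence $\varphi \in$ TMSO$^\otimes(\tilde{\Sigma})$.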

This result is established by exploiting Theorem 1 and yet again the notion 
of proper interval alphabets. 

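As a natural temporal logic counterpart of TMSO$^\otimes(\tilde{\Sigma})$, we present here
the logic TLTL$^\otimes(\tilde{\Sigma})$. Again, we will build up the formulas in two steps.
For ease of presentation, we will not deal with atomic propositions here. Where necessary,
they can be introduced component-wise and dealt with easily.

    TLTL($\Sigma_i$) ::= $\top \mid \sim\varphi \mid \varphi \vee \varphi' \mid \langle a, I \rangle \varphi \mid O\varphi \mid \varphi\, U\, \varphi'$.

For each formula of the form $\langle a, I \rangle \varphi$, we require that $a \in \Sigma_i$.

The formulas of TLTL$^\otimes(\tilde{\Sigma})$ are simply boolean combinations of
TLTL($\Sigma_i$)-formulas:

    TLTL$^\otimes(\tilde{\Sigma})$ ::= $\varphi(i) \mid \sim\alpha \mid \alpha \vee \beta$.

Here we require from each formula of the form $\varphi(i)$ that $\varphi \in$ TLTL($\Sigma_i$).

The semantics of this logic is given by defining the relation $\sigma \models \alpha$ with
$\sigma \in T\Sigma^\infty$. To spell out the details, $\sigma \models \varphi(i)$ iff
$\sigma \upharpoonright i, \epsilon \models_i \varphi$. The relation $\models_i$ is
defined in the obvious way. For instance,

    $\sigma, \tau \models_i \langle a, I \rangle \varphi$ iff there exists $t$ such that
    $\tau(a, t) \in prf(\sigma)$ and $\sigma, \tau(a, t) \models_i \varphi$ and $t - time(\tau) \in I$.

On the other hand $\sigma, \tau \models_i O\varphi$ iff there exists
$\tau(a, t) \in prf(\sigma)$ such that $\sigma, \tau(a, t) \models_i \varphi$.

For each $\alpha$ in TLTL$^\otimes(\tilde{\Sigma})$ we define
$L(\alpha) = \{\sigma \in T\Sigma^\infty \mid \sigma \models \alpha\}$. Clearly, $\alpha$
is satisfiable iff $L(\alpha)$ is non-empty.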

Theorem 5. For each $\alpha$ in TLTL$^\otimes(\tilde{\Sigma})$ we can effectively construct
a PI automaton $A_\alpha$ such that $L(\alpha) = L(A_\alpha)$ and the number of global
states of $A_\alpha$ is … . Further, the largest constant mentioned in the $i$-th component
of $A_\alpha$ is at most the largest $i$-type constant mentioned in $\alpha$.

Proof sketch. The proof follows the usual automata-theoretic techniques. To
account for negation correctly, the local transition relations of $A_\alpha$ must be over
proper alphabets. The obvious idea of converting $\alpha$ into a formula which mentions
only intervals taken from a proper interval set will cause an exponential
blow-up. To avoid this, we define the (Fischer-Ladner) closure of $\alpha$ to be the
smallest set of formulas containing the subformulas of $\alpha$ and satisfying: If
$(\varphi\, U\, \varphi')(i)$ is in the closure then so is $(O(\varphi\, U\, \varphi'))(i)$.
Let $CL$ be the closure of $\alpha$ and $CL_i = CL \cap$ TLTL($\Sigma_i$) for each $i$. We
now define an $i$-type atom to be a propositionally consistent subset of $CL_i$ which
satisfies: If $\langle a, I \rangle \varphi$ and $\langle b, I' \rangle \varphi'$ are in the
atom then $a = b$ and $I \cap I' \neq \emptyset$. We can now use $i$-type atoms to
manufacture the $i$-local states. The $i$-local transition relation can then be chosen
to be proper. The remaining details are guided by the way the logic
LTL$^\otimes(\tilde{\Sigma})$ is dealt with in [20]. □

Using theorem 5, and the emptiness check for PI automata in section 2, we 
can decide satisfiability of a in time 2*^(1“!+^ In fact, we can show: 

Corollary 1. The satisfiability problem for TLTL$^\otimes(\tilde{\Sigma})$ is PSPACE-complete.

Given a formula $\alpha$ in TLTL$^\otimes(\tilde{\Sigma})$ we can check the emptiness of
the automaton $A_\alpha$ of Theorem 5 non-deterministically while using space polynomial
in the size of $\alpha$. The argument is similar to the one in [1]. The fact that the
problem is PSPACE-hard follows from the observation that the satisfiability problem
for LTL can be reduced to the satisfiability problem for the one-agent case of
TLTL$^\otimes(\tilde{\Sigma})$.

Next suppose we consider a real-time program $Pr$ modelled by a PI automaton
$A_{Pr}$, and a formula $\alpha$ of TLTL$^\otimes(\tilde{\Sigma})$. Then $Pr$ is said to
meet the specification $\alpha$ iff $L(A_{Pr}) \subseteq L(\alpha)$. The model checking
problem for TLTL$^\otimes(\tilde{\Sigma})$ is to determine whether $Pr$ meets the
specification $\alpha$. It is not difficult to show:

Theorem 6. The model checking problem for TLTL$^\otimes(\tilde{\Sigma})$ is PSPACE-complete.

We have not studied in detail the power and limitations of TLTL$^\otimes(\tilde{\Sigma})$
from a pragmatic standpoint. We have explored TLTL$^\otimes(\tilde{\Sigma})$ here mainly
because it is the natural temporal logic suggested by TMSO$^\otimes(\tilde{\Sigma})$, the
characteristic logic for PI automata. In fact, it is not difficult to show that
TLTL$^\otimes(\tilde{\Sigma})$ is expressively equivalent to the first-order fragment of
TMSO$^\otimes(\tilde{\Sigma})$.

In order to provide some example properties that can be stated in this timed
temporal logic, we first sketch briefly how product interval automata can be used
to model asynchronous digital circuits. (The full details will appear elsewhere.)
Our starting point is the circuit model developed in [17] using inertial delays.
The behaviour of a $k$-gate circuit $y$, denoted $T(y)$, can be defined as a subset
of $T\Sigma^\infty$ where $\Sigma$ is the alphabet $\{0,1\}^k \times \{0,1\}^k$. An element
of $\{0,1\}^k$ will represent the visible values of the gate outputs. The occurrence of
an action $(s, s')$ will indicate the visible values of the gate outputs going from $s$
to $s'$ due to the simultaneous switching of an appropriate set of gates. We can
associate with $y$ a $(k+1)$-component product interval automaton $A_y$ such that
$L(A_y) = T(y)$. We use one component for each gate in the circuit. This component
will keep track of the switching status of the gate (whether excited or quiescent)
and, in case it is in an excited state, the time elapsed since its excitement was last
initiated. The $(k+1)$-th component keeps track of the current vector of visible
values. Suppose its state is $s$ and it permits an action $(t, t')$ at $s$; then it will be
the case that $s = t$. It will also be the case that the circuit allows the transition
from $t$ to $t'$ through the switching of a subset of gates that are excited at $t$.
Furthermore, the distribution of actions across the components is such that the
action $(t, t')$ will involve, apart from the component $k+1$, all those components
whose switching status is affected by the switches. As for timing constraints, the
component $i$ will have timing constraints that enforce the inertial delay
corresponding to the output wire of gate $i$, while the $(k+1)$-th component will be
free of any timing constraints.

We can use the logic TLTL$^\otimes(\tilde{\Sigma})$ to express (and hence model-check) a
variety of properties of $y$. Two examples of such properties are:

— If the input signals to $y$ stabilise then the behaviour of $y$ eventually
stabilises.

— Gate $i$ is $d$-persistent: whenever it becomes excited it turns quiescent only
through its switching, and furthermore this switching occurs within $d$ time
units since it got excited.



5 Distributed Interval Automata 

Here we consider a more expressive class of distributed timed automata but 
having the same flavour as PI automata. The extension we have in mind will 
parallel the extension of product automata to asynchronous automata in the 
setting of traces as detailed in [15]. 

We shall continue to work with the $\mathcal{P}$-distributed alphabet $\tilde{\Sigma}$. An
$a$-interval for us will be a map $J : loc(a) \to INT$, and we will refer to the
collection of all such maps as the set of $a$-intervals. An $a$-interval, viewed as a
rectangular region, will constitute the guard constraining the timed occurrence of
an $a$-action.

In the definition of our automata and elsewhere we will often write $\{X_i\}$
to denote the family $\{X_i\}_{i \in \mathcal{P}}$. A distributed interval automaton over
$\tilde{\Sigma}$ is a structure

    $A = (\{Q_i\}, \{\rightarrow_a\}_{a \in \Sigma}, Q_{in}, \{F_i, G_i\})$

where the various components of A are defined as follows. While doing so we 
will also develop some notations. 

(i) Each $Q_i$ is a finite non-empty set called the set of $i$-local states. Let $P$ be
a non-empty subset of $\mathcal{P}$. Then a $P$-state is a map
$q : P \to \bigcup_{i \in \mathcal{P}} Q_i$ such that $q(i) \in Q_i$ for each $i \in P$.
An $a$-state is just a $loc(a)$-state and a $\mathcal{P}$-state will be called a global
state. We let $Q_a$ denote the set of $a$-states and $Q_{\mathcal{P}}$ the set of
global states.

(ii) $\rightarrow_a$ is a finite set of triples $(q, J, q')$, where $q, q' \in Q_a$ and $J$
is an $a$-interval, for each $a \in \Sigma$.

(iii) $Q_{in} \subseteq Q_{\mathcal{P}}$ is the set of global initial states.

(iv) $F_i, G_i \subseteq Q_i$ for each $i$.

Thus a distributed interval automaton is a timed version of an asynchronous
Büchi automaton as formulated in [18], which in turn is a minor variant of the
original formulation due to Gastin and Petit [11]. If one ignores the acceptance
conditions then an asynchronous automaton can be viewed as a labelled 1-safe
Petri net. Consequently a distributed interval automaton can also be viewed as
a timed 1-safe Petri net. However, as pointed out in the introduction, the timed
semantics we assign to our automata is not the semantics one encounters in the
literature concerning timed Petri nets [5].

We can now systematically extend the theory of PI automata - as sketched 
in the previous sections - to distributed interval automata. The details can be 
found in [7]. 

It turns out that the so-called cellular version of distributed interval automata
has the same expressive power as the event recording automata [2]. In
this version of asynchronous automata there will be one component for each
letter in $\Sigma$ and one clock associated with each component. The interesting fact
is that in the untimed setting both versions of asynchronous automata have the
same expressive power. Once again we refer the reader to [7] for the details.

References 

1. R. Alur, D. L. Dill: A theory of timed automata, Theoretical Computer Science
126: 183-235 (1994).

2. R. Alur, L. Fix, T. A. Henzinger: Event-clock automata: a determinizable class of
timed automata, Proc. 6th International Conference on Computer-aided Verification,
LNCS 818, 1-13, Springer-Verlag (1994).

3. R. Alur, T. A. Henzinger: Logics and Models of Real Time: A Survey, in Real-Time:
Theory in Practice, J. W. de Bakker, H. Huizing, W.-P. de Roever, G. Rozenberg
(Eds.), LNCS 600, 74-106 (1992).

4. J. Bengtsson, B. Jonsson, J. Lilius, W. Yi: Partial Order Reductions for Timed
Systems, Proc. CONCUR ’98, LNCS 1466 (1998).

5. B. Berthomieu, M. Diaz: Modelling and Verification of Time Dependent Systems
Using Time Petri Nets, IEEE Trans. on Soft. Engg. Vol 17, No. 3, March 1991.

6. V. Diekert, G. Rozenberg: The Book of Traces, World Scientific, Singapore (1995).

7. D. D’Souza, P. S. Thiagarajan: Distributed Interval Automata, Internal
Report TCS-98-3, Chennai Mathematical Institute (1998). (available at
http://www.smi.ernet.in/techreps/)

8. W. Ebinger, A. Muscholl: Logical definability on infinite traces, Theoretical
Computer Science 154: 67-84 (1996).

9. T. A. Henzinger, P. W. Kopke, A. Puri, P. Varaiya: What’s decidable about hybrid
automata?, Proc. 27th Annual Symposium on Theory of Computing, 373-382, ACM
Press (1995).

10. M. Minea: Partial Order Reduction for Model Checking of Timed Automata. To
appear in the Proceedings of CONCUR ’99.

11. P. Gastin, A. Petit: Asynchronous cellular automaton for infinite traces,
Proceedings of ICALP ’92, LNCS 623, 583-594 (1992).

12. R. Gerth, D. Peled, M. Vardi, P. Wolper: Simple On-the-fly Automatic Verification
of Linear Temporal Logic, Proc. 15th IFIP WG 6.1 Int. Workshop on Protocol
Specification, Testing, and Verification, North-Holland Publ. (1995).

13. T. A. Henzinger: It’s About Time: Real-Time Logics Reviewed, Proc. CONCUR
’98, LNCS 1466, 366-372 (1998).

14. T. A. Henzinger, J.-F. Raskin, P.-Y. Schobbens: The regular real-time languages,
Proc. 25th International Colloquium on Automata, Languages, and Programming
1998, LNCS 1443, 580-591 (1998).






15. J. G. Henriksen, P. S. Thiagarajan: A Product Version of Dynamic Linear Time
Temporal Logic, Proc. CONCUR ’97, LNCS 1243, 45-58 (1997).

16. N. Klarlund, M. Mukund, M. Sohoni: Determinizing Büchi Asynchronous Automata,
Proceedings of FSTTCS 15, LNCS 1026, 456-470 (1995).

17. O. Maler, A. Pnueli: Timing Analysis of Asynchronous Circuits using Timed
Automata, in Proc. CHARME ’95, LNCS 987, 189-205 (1995).

18. M. Mukund, P. S. Thiagarajan: Linear Time Temporal Logics over Mazurkiewicz
Traces, Proc. MFCS 96, LNCS 1113, 62-92 (1996).

19. J.-F. Raskin, P.-Y. Schobbens: State-clock Logic: A Decidable Real-Time Logic,
Proc. HART ’97: Hybrid and Real-Time Systems, LNCS 1201, 33-47 (1997).

20. P. S. Thiagarajan: A Trace Consistent Subset of PTL, Proc. CONCUR ’95,
LNCS 962, 438-452 (1995).

21. W. Thomas: Automata on Infinite Objects, in J. V. Leeuwen (Ed.), Handbook of
Theoretical Computer Science, Vol. B, 133-191, Elsevier Science Publ., Amsterdam
(1990).

22. Th. Wilke: Specifying Timed State Sequences in Powerful Decidable Logics and
Timed Automata, in Formal Techniques in Real-Time and Fault-Tolerant Systems,
LNCS 863, 694-715 (1994).

23. G. Winskel, M. Nielsen: Models for Concurrency, in S. Abramsky, D. Gabbay (Eds.)
Handbook of Logic in Computer Sc., Vol. 3, Oxford Univ Press (1994).

24. W. Yi, B. Jonsson: Decidability of Timed Language-Inclusion for Networks of
Real-Time Communicating Sequential Processes, in Proc. FST&TCS 94, LNCS 880
(1994).



The Complexity of Rebalancing 
a Binary Search Tree 



Rolf Fagerberg* 

BRICS, Department of Computer Science, Aarhus University,
DK-8000 Århus C, Denmark
rolf@brics.dk



Abstract. For any function $f$, we give a rebalancing scheme for binary
search trees which uses amortized $O(f(n))$ work per update while maintaining
a height bounded by $\lceil \log(n+1) + 1/f(n) \rceil$. This improves on
previous algorithms for maintaining binary search trees of very small
height, and matches an existing lower bound. The main implication is
the exact characterization of the amortized cost of rebalancing binary
search trees, seen as a function of the height bound maintained. We
also show that in the semi-dynamic case, a height of $\lceil \log(n+1) \rceil$ can be
maintained with amortized $O(\log n)$ work per insertion. This implies new
results for TreeSort, and proves that it is optimal among all comparison
based sorting algorithms for online sorting.



1 Introduction 

The binary search tree is one of the fundamental data structures of computer 
science. Its importance lies in its ability to maintain a set in sorted order during 
insertions and deletions, while supporting a wide range of search operations on 
the elements. It has a vast number of applications, and most undergraduate 
textbooks on algorithms devote an entire chapter to it. 

Much of the research on binary search trees has been centered around
rebalancing schemes, and numerous such schemes exist which keep a height of
$c \cdot \log(n)$, for some constant $c > 1$, while supporting updates in $O(\log n)$ time.
Some of the more well known examples include AVL-trees [1] with $c = 1.44$,
red-black trees [9] with $c = 2$, and BB[$\alpha$]-trees [7,14] with $2 < c < 3.45$
(depending on $\alpha$), but there are many more.

The trivial lower bound on the height of a binary tree with $n$ nodes is
$\lceil \log(n+1) \rceil$. While being within a constant factor of this lower bound may
be sufficient for most practical purposes, a natural theoretical question is
exactly how close to this optimum we can keep the height. Presumably, the answer
depends on how much rebalancing we are willing to do after an insertion or
deletion of a key - for instance, using $O(n)$ work per update, the trivial lower bound

* Supported by the Danish National Science Research Council (grant no. 11-0575) 
and by the ESPRIT Long Term Research Programme of the EU under project no. 
20244 (ALCOM-IT). This research was done at Department of Mathematics and 
Computer Science, University of Southern Denmark, Odense, Denmark. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 72—83, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






is attainable by simply rebuilding the entire tree after each update. Thus, in a
more general form, the question is:

    Given a function $f$, what is the best possible height maintainable with
    $O(f(n))$ rebalancing work per update?

Partial answers to this question exist. In particular, upper bounds better
than those mentioned above have been given. Already in 1976, Maurer et al.
presented a rebalancing scheme [13] with a height bound of $c \cdot \log(n)$, where $c$
can be chosen arbitrarily close to one. The rebalancing cost is $O(\log n)$, with the
constant depending on $c$. The gap was further closed around 1990, when Andersson
and Lai in their theses [3,11] and in resulting papers [2,4,5,6,12] gave a series
of schemes for maintaining height $\lceil \log(n+1) \rceil + 1$, using amortized
$O(\log^2 n)$ rebuilding work for the simplest, improving to $O(1)$ for the most
complicated.

These very positive results are strongly contrasted by an observation of
Lai [11], which shows that $\Omega(n)$ rebuilding work per update is necessary, even
in the amortized sense, for keeping optimal height $\lceil \log(n+1) \rceil$ for all $n$
during insertions and deletions. A generalization of this lower bound has been given
in [8], where it is shown that for any function $f$, the maintenance of height
$\lceil \log(n+1) + 1/f(n) \rceil$ requires $\Omega(f(n))$ amortized rebuilding work per
update.

In this paper, we give a matching upper bound by showing how to maintain
height $\lceil \log(n+1) + 1/f(n) \rceil$ for all $n$, in amortized $O(f(n))$ rebuilding
work per update.

Taken together, these results provide an exact answer to the question above
in the case of amortized complexity - namely a height of

    $\lceil \log(n+1) + \Theta(1/f(n)) \rceil$.

This expression may be seen as describing the intrinsic amortized complexity
of rebalancing a binary search tree.

The importance of the new upper bound is of course not of a practical nature
- in few situations does it matter whether the height bound is
$\lceil \log(n+1) + 1 \rceil$ or $\lceil \log(n+1) + o(1) \rceil$. Rather, its main virtue is
the final settling of the exact amortized cost of rebalancing a binary search tree.

Other implications exist, though. As a corollary to our new upper bound,
we prove that in the semi-dynamic case (i.e. insertions only), optimal height
$\lceil \log(n+1) \rceil$ can be maintained at an amortized cost of $O(\log n)$ per
insertion. This improves the previous best upper bound of $O(\log^2 n)$ from [3], and
matches a lower bound in [8].

This corollary also implies improved results for TreeSort (i.e. sorting by
repeated insertions into a binary search tree). The best possible bound,

    $\sum_{i=0}^{n-1} \lceil \log(i+1) \rceil = n \lceil \log n \rceil - 2^{\lceil \log n \rceil} + 1$,

on the number of comparisons for TreeSort can now be achieved in $O(n \log n)$
time, as opposed to $O(n \log^2 n)$ before. For online sorting, where elements arrive






one at a time, and the set has to be sorted at all times, this is best possible for 
comparison based sorting algorithms ([10], page 184 and 204). Thus, TreeSort is 
optimal (with respect to the number of comparisons and to the actual time) for 
online sorting in the comparison model. 
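
The closed form for the TreeSort comparison bound can be checked numerically. The sketch below reads the summand as $\lceil \log(i+1) \rceil$, the optimal height of a tree holding $i$ keys, with logarithms to base 2; the function names are illustrative only.

```python
def ceil_log2(m):
    """Return ceil(log2(m)) for an integer m >= 1, using exact integer arithmetic."""
    return (m - 1).bit_length()

def comparisons_sum(n):
    """Sum of ceil(log2(i + 1)) for i = 0, ..., n - 1."""
    return sum(ceil_log2(i + 1) for i in range(n))

def closed_form(n):
    """n * ceil(log2 n) - 2**ceil(log2 n) + 1, for n >= 1."""
    c = ceil_log2(n)
    return n * c - 2 ** c + 1
```

For example, both functions give 8 for $n = 5$: the summands are 0, 1, 2, 2, 3, and the closed form is $5 \cdot 3 - 8 + 1$.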

2 Definitions 

In this paper, the subject is rebalancing in standard binary search trees. For
technical purposes, we will also be dealing with unary-binary search trees.¹ We
define both types here: A binary node has a (possibly empty) left and a (possibly
empty) right subtree, and contains one key. A unary node has one (possibly
empty) subtree and contains no key. A binary search tree is a tree containing
only binary nodes. A unary-binary search tree is a tree that may contain both
types of nodes, but with the restriction that all empty subtrees have the same
depth. In both types of trees, keys are distributed in an in-order fashion. For
examples of unary-binary trees, see the figures later in this paper.

The size of a search tree is its number of binary nodes (i.e. its number of
keys). The height of a tree is defined as zero for empty trees, and as one plus
the height of the tallest subtree of the root, otherwise. The level of a node is the
height of the subtree in which it is the root. The rebalancing cost of an update
is the number of pointers which are changed in order to rebalance the tree after
the update.²
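
The definitions above translate directly into code. The following is a minimal sketch under an assumed encoding that is not from the paper: a tree is `None` when empty, `("U", child)` for a unary node, and `("B", key, left, right)` for a binary node.

```python
def height(t):
    """Zero for the empty tree, otherwise one plus the height of the tallest subtree."""
    if t is None:
        return 0
    if t[0] == "U":
        return 1 + height(t[1])
    return 1 + max(height(t[2]), height(t[3]))

def size(t):
    """Number of binary nodes, i.e. number of keys."""
    if t is None:
        return 0
    if t[0] == "U":
        return size(t[1])
    return 1 + size(t[2]) + size(t[3])

def empty_depths(t, depth=0):
    """Depths at which empty subtrees occur; a single value in a unary-binary search tree."""
    if t is None:
        return {depth}
    if t[0] == "U":
        return empty_depths(t[1], depth + 1)
    return empty_depths(t[2], depth + 1) | empty_depths(t[3], depth + 1)

def keys_in_order(t):
    """Keys in in-order; unary nodes hold no key and are passed through."""
    if t is None:
        return []
    if t[0] == "U":
        return keys_in_order(t[1])
    return keys_in_order(t[2]) + [t[1]] + keys_in_order(t[3])

def leaf(key):
    """A binary node with two empty subtrees."""
    return ("B", key, None, None)

# Example: the left child of the root is unary, so only five keys fit in height 3.
T = ("B", 2, ("U", leaf(1)), ("B", 4, leaf(3), leaf(5)))
```

The level of a node is then simply `height` applied to the subtree rooted at that node.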

3 New Results 

Theorem 1. Let $k$ be a positive integer, let $\epsilon$ be a real number between 0 and
1/2, and let $N$ denote $2^k - 1$. There exists a rebalancing scheme for doing insertions
in a binary search tree while its size grows from $\lfloor (1-\epsilon)N \rfloor$ to
$\lceil (1-\epsilon/2)N \rceil$, which guarantees optimal height $k$ in the tree after each
insertion, and which has an amortized rebalancing cost of $O(1/\epsilon)$ per insertion
(including the cost of initializing the scheme).

Theorem 1 is the main result of this paper. It meets a lower bound from [8]
stating that in any binary search tree of size $(1-\epsilon)(2^k - 1)$ and optimal
height $k$, an insertion can be made such that the rebalancing cost of restoring optimal
height is $\Omega(1/\epsilon)$. Essentially, these results describe how the amortized cost
of maintaining optimal height $\lceil \log(n+1) \rceil$ in a binary search tree changes
when the free room gets scarce - i.e. when $n$ approaches the next power of two.

Theorem 1 is proven in Sects. 4-6. For the remainder of this section, we show 
how it can be transformed into the results stated in the introduction. 

¹ Some subclasses of unary-binary trees were studied in the late 70’s under the name
brother trees (see e.g. [15] and the references in it).

² Assuming that keys stay in the same nodes. If keys are interchanged between nodes
during the rebalancing, we also count the number of moved keys (or view
interchanges of keys as pointer changes - the count is the same up to a constant factor).






In the first corollary below, we restrict ourselves to functions $f$ which are
non-decreasing, positive functions, for which there exists an integer $m$ such that
$f(2n) < 2f(n)$ for all $n > m$. We call such functions smooth. Clearly, any smooth
function must be in $O(n)$, but this is no restriction, as $f \in \Omega(n)$ means that
the entire tree can be rebuilt after each update, making Corollary 1 obvious.
Basically all standard functions in $O(n)$ are smooth, including $\log^k(n)$ for all $k$
and $n^\epsilon$ for all $\epsilon < 1$.

Corollary 1. For any smooth function $f$, there exists a rebalancing scheme for
doing insertions and deletions in binary search trees at an amortized rebalancing
cost of $O(f(n))$ per update, while keeping the height bounded by
$\lceil \log(n+1) + 1/f(n) \rceil$ at all times.

Proof. From the rebalancing scheme in Theorem 1, we first construct a scheme
fulfilling the statement in the corollary for insertions only. We consider, for any $k$,
how to insert $2^{k-1}$ elements while $n$ grows from $N_1 = 2^{k-1} - 1$ to
$N_2 = 2^k - 1$. Let $\epsilon_0 = 1/f(N_2)$, and let $i$ be the smallest integer such that
$1/2^i < \epsilon_0 \ln(2)/2$. As $n$ grows from $N_1$ to $(1 - 1/2^i)N_2$, we employ the
scheme from Theorem 1, changing $\epsilon$ when necessary. This keeps a height of $k$ at
an amortized rebuilding cost of $O(1/\epsilon_0)$ per insertion. As $n$ grows from
$(1 - 1/2^i)N_2$ to $N_2$, we employ Andersson's and Lai's best scheme from [6], which
keeps a height of $k+1$ at an amortized price of $O(1)$. The change of scheme requires
an $O(n)$ time global rebuilding, but this is of no concern when dealing with amortized
complexity.

Using first order approximation to the logarithm function and its convexity,
it can be verified that the height is bounded by $\lceil \log(n+1) + \epsilon_0 \rceil$ at
all times. Thus, we maintain height $\lceil \log(n+1) + 1/f(n) \rceil$ in amortized time
$O(f(n))$, as $f$ is smooth.

This rebalancing scheme can now be made fully dynamic by simply marking
nodes as deleted, employing the above scheme with the function $2f(n)$ in place of
$f(n)$, and after every $O(n/f(n))$ updates rebuilding the entire structure while
removing marked nodes. This is still amortized $O(f(n))$ work per update, and using
first order approximation to the logarithm function it can be verified that the height
is bounded by $\lceil \log(n+1) + 1/f(n) \rceil$ at all times. Alternatively, deletions can
be done by a procedure similar to the insertion algorithm described later in this
paper (by pushing extra unary nodes upwards, instead of pulling missing unary
nodes downwards). □



Corollary 2. There exists a rebalancing scheme for doing insertions only in
a binary tree at an amortized rebalancing cost of $O(\log n)$ per insertion, while
keeping the optimal height $\lceil \log(n+1) \rceil$ at all times.

Proof. For any $k$, we insert $2^{k-1}$ elements while increasing the size from
$2^{k-1} - 1$ to $2^k - 1$ by using the scheme from Theorem 1 with $\epsilon = 1/2^i$ over
the next $2^{k-1-i}$ insertions, for $i = 1, 2, \ldots, k-1$. Summing over the $2^{k-1}$
insertions, a total work of $O(k 2^k)$ can be calculated. This implies the result. □
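
The summation in the proof of Corollary 2 can be made concrete. The sketch below assumes that an insertion in phase $i$ costs exactly $c/\epsilon = c \cdot 2^i$ for some constant $c$ (the theorem only gives this up to a constant factor), and the function names are illustrative.

```python
def phases(k):
    """Phase i (1 <= i <= k - 1) uses eps = 1 / 2**i for 2**(k - 1 - i) insertions."""
    return [(i, 2 ** (k - 1 - i)) for i in range(1, k)]

def total_work(k, c=1):
    """Total work if each insertion in phase i costs c / eps = c * 2**i (assumed cost model)."""
    return sum(count * c * 2 ** i for i, count in phases(k))
```

Each phase contributes $2^{k-1-i} \cdot 2^i = 2^{k-1}$ units, so the total is $(k-1) 2^{k-1}$, which is $O(k 2^k)$. Spread over roughly $2^{k-1}$ insertions this is $O(k) = O(\log n)$ per insertion. The phases as listed cover $2^{k-1} - 1$ insertions, one short of the $2^{k-1}$ mentioned in the proof.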






4 Initialization 

Our goal is to keep optimal height in a binary search tree during insertions. If the 
next insertion increases the height, some rebalancing must be done. Intuitively, 
this rebalancing process can be seen as one of moving a “hole” from somewhere 
else in the tree to the insertion point. 

Still on the informal level, a key idea utilized in our rebalancing scheme is 
the observation that “rows of holes” are cheaper to move than “single holes” . 
The figure below illustrates the point. Only edges incident to the marked nodes 
need to be changed. 








Loosely speaking, a row of $O(r)$ holes can travel past $\Theta(s)$ keys in $O(s/r)$
pointer changes. The central feature of our rebalancing scheme is an initial
distribution of holes which allows the observation above to be exploited for an
extended sequence of insertions at low amortized cost.

To formalize the ideas above, we use a natural mapping $\psi$ from unary-binary
trees of height $k$ to binary trees of height at most $k$: To each unary-binary tree $T$
we associate a corresponding binary tree $\psi(T)$ by simply contracting all unary
nodes. The following figure shows a unary-binary tree $T$ and its corresponding
binary tree $\psi(T)$. Unary nodes are square, binary nodes circular.




We will use unary-binary trees to describe the initial set-up as well as the
rebalancing operations. This is purely for ease of notation - the actual trees
we are working on are standard binary trees. As for implementation, the
unary-binary tree can be used as an information structure for rebalancing in the actual
binary search tree. It should be clear that the rebalancing cost in the
unary-binary tree $T$ is at least as large as the rebalancing cost in the corresponding
binary tree $\psi(T)$.
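
The contraction of unary nodes is a short recursion. The sketch below assumes the same hypothetical encoding as before, which is not from the paper: `None` for the empty tree, `("U", child)` for a unary node and `("B", key, left, right)` for a binary node.

```python
def contract(t):
    """Map a unary-binary tree to a binary tree by contracting all unary nodes."""
    if t is None:
        return None
    if t[0] == "U":
        # A unary node is replaced by the contraction of its single subtree.
        return contract(t[1])
    return ("B", t[1], contract(t[2]), contract(t[3]))

def height(t):
    """Zero for the empty tree, otherwise one plus the height of the tallest subtree."""
    if t is None:
        return 0
    if t[0] == "U":
        return 1 + height(t[1])
    return 1 + max(height(t[2]), height(t[3]))

def leaf(key):
    """A binary node with two empty subtrees."""
    return ("B", key, None, None)

T = ("B", 2, ("U", leaf(1)), ("B", 4, leaf(3), leaf(5)))
```

Contraction keeps the keys and their in-order sequence and can only shorten root-to-leaf paths, which is why the resulting binary tree has height at most that of the unary-binary tree.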

If we define the weight of a unary node at level $h$ to be $2^{h-1}$, the following
holds.



Lemma 1. In a unary-binary tree $T$ of height $k$, the number of binary nodes
plus the total weight of the unary nodes sum up to $2^k - 1$.







Proof. Consider building $T$ from a complete binary tree of $2^k - 1$ nodes and
height $k$ by changing the appropriate binary nodes into unary nodes, one level
at a time in a top-down fashion. When a binary node at level $h$ is changed into a
unary node, exactly $2^{h-1}$ nodes disappear from the tree. From this, the lemma
follows. □
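
Lemma 1 can be checked on small examples, taking the weight of a unary node at level $h$ to be $2^{h-1}$. The sketch assumes the hypothetical tuple encoding `None` / `("U", child)` / `("B", key, left, right)`; the function `measure` is illustrative and not part of the paper.

```python
def measure(t):
    """Return (height, number of binary nodes, total weight of unary nodes) of t."""
    if t is None:
        return 0, 0, 0
    if t[0] == "U":
        h, binaries, weight = measure(t[1])
        # The unary node sits at level h + 1, so its weight is 2**((h + 1) - 1) = 2**h.
        return h + 1, binaries, weight + 2 ** h
    hl, bl, wl = measure(t[2])
    hr, br, wr = measure(t[3])
    return 1 + max(hl, hr), 1 + bl + br, wl + wr

def leaf(key):
    """A binary node with two empty subtrees."""
    return ("B", key, None, None)

# Height 3, five binary nodes, one unary node at level 2 (weight 2): 5 + 2 = 2**3 - 1.
T = ("B", 2, ("U", leaf(1)), ("B", 4, leaf(3), leaf(5)))
```

The check only makes sense for valid unary-binary trees, i.e. those in which all empty subtrees have the same depth.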

Now let $k$ and $\epsilon$ be given, and set $N = 2^k - 1$. We will build a specific
initial tree of size $(1-\epsilon)N$ and height $k$. By Lemma 1, our task is to distribute
a total weight of $\epsilon N$ in a unary-binary tree of height $k$.

We partition this total weight into three parts: A reservoir of approximately 
three fourths of the total weight, a set of layers having less than one eighth of 
the total weight, and the rest, which is just whatever weight remains when the 
two first parts have been allotted. 

The reservoir is distributed as unary nodes at a certain level $h_{\max}$. The layers
consist of unary nodes at odd levels less than $h_{\max}$. Finally, the rest is unary
nodes at level one. The underlying idea is that the layers will allow unary nodes to
propagate from the reservoir to the insertion point at sufficiently low amortized
cost, while the reservoir is big enough to sustain this for the requested number
of insertions.

We now describe how to build the initial unary-binary tree in a top-down
fashion. We define

    $h_{\max} = 2\lfloor (k + \log(\epsilon))/3 \rfloor - 5$.

Above this level, we place no unary nodes. Hence, the top of the tree consists
of a complete binary tree of height $k - h_{\max} + 1$. At level $h_{\max}$, we make
$R = \lceil (3\epsilon N/4)/2^{h_{\max}-1} \rceil = \lceil 3\epsilon N/2^{h_{\max}+1} \rceil$
of the nodes unary. Their distribution on the level will not be significant. These
unary nodes constitute the reservoir.
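
As a worked instance of these definitions, the sketch below computes the reservoir level and the number of reservoir nodes for given $k$ and $\epsilon$. It assumes logarithms to base 2 and uses floating-point arithmetic, which is adequate for moderate $k$; the function name is illustrative only.

```python
import math

def reservoir_parameters(k, eps):
    """Return (h_max, R): the reservoir level and the number of unary nodes placed on it."""
    n_total = 2 ** k - 1                                   # N = 2^k - 1
    h_max = 2 * math.floor((k + math.log2(eps)) / 3) - 5   # always odd
    r = math.ceil(3 * eps * n_total / 2 ** (h_max + 1))    # ceil(3 eps N / 2^(h_max + 1))
    return h_max, r
```

For $k = 30$ and $\epsilon = 1/4$ this gives $h_{\max} = 13$ and $R = 49152$. Each reservoir node has weight $2^{12}$, so the reservoir holds a total weight of $49152 \cdot 4096 = 201326592$, just above $3\epsilon N/4$.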

The unary nodes belonging to the layers are distributed on levels $h_{\max} - 2$,
$h_{\max} - 4$, etc. At each of these levels, we divide the nodes (unary as well as
binary) into groups. Each group constitutes a consecutive row of nodes, when going from
left to right on that level. We see level $h_{\max}$ as a single group, consisting of all
the nodes on that level. The rest of the tree is now constructed in a top-down
fashion as follows.

Given a group on level $h$, we add a level of binary nodes just below it. This
determines the number $x$ of nodes on level $h - 2$. These nodes are now divided
into eight groups as evenly as possible, i.e. into groups of sizes $\lfloor x/8 \rfloor$ and
$\lceil x/8 \rceil$. In each of these groups, four of the nodes are made unary. Their
distribution inside the group is not significant. The figure below illustrates the process
for a group of 15 nodes. The first node in each group is indicated by a mark.



[Figure: one step of the construction, applied to a group of 15 nodes.]






Eight new groups on level $h - 2$ have been created, and the process is now
repeated recursively with these. The recursion stops when the groups on level
one have been created, i.e. after $\lfloor (k + \log(\epsilon))/3 \rfloor - 3$ steps of the recursion. A
sketch of the resulting tree looks like this:



[Figure: sketch of the resulting tree, with the layers and the rest (not shown) labelled.]



Only unary nodes are shown. The horizontal dashed lines indicate levels in
the tree, and the short vertical lines are group borders. The groups above and 
below the vertical ellipsis are not to scale. For clarity, in the figure each group 
only spans two groups immediately below it - the actual number is eight. 

The weight limit has not been exceeded by the above construction: 

Lemma 2. The total weight of the unary nodes in the reservoir and in the layers
is less than $\epsilon N$.

Proof. Summing up this weight is rather straightforward. The details can be 
found in the full paper. □ 

Whatever weight remains after the tree above has been built is simply
distributed as unary nodes on level one. Their exact distribution will not be
significant.

Later, we need the following fact: 

Lemma 3. When the initial tree has been built, the groups on level $1 + 2i$ each
contain between and nodes (when k is sufficiently large). 

Proof. The proof proceeds by estimating the group sizes at each level in a top- 
down fashion. The details can be found in the full paper. □ 

5 Rebalancing 

We describe all operations as taking place in unary-binary trees. As usual, an
insertion starts by a search down the tree using the in-order ordering of the keys
present. Unary nodes contain no keys, and are just passed through. The search
stops when an empty tree is encountered. If the father of the empty tree is unary,
we simply make it binary and deposit the new key in it. This is illustrated below,
where the key 10 is inserted into the tree on the left.





If this is not the case, we rebuild the tree in a way to be described now, 
with the result that some unary node is moved to the insertion point. Then we 
proceed as above. 

The rebuilding is composed of two basic operations: a horizontal slide of a 
unary node, and a vertical redistribution of a unary node on some level into 
four unary nodes two levels below. The slide moves a unary node within the 
same level by shifting children between neighboring nodes, as illustrated below. 
Triangles designate subtrees of the same height. Besides the shifting of children 
on the level above the triangles, the keys in the marked nodes must be moved 
around to preserve the in-order ordering of the keys. 








This is essentially the unary-binary formulation of the observation stated in 
the beginning of section 4. In this way, a unary node on any level can move a 
horizontal distance of $d$ nodes at a rebuilding cost of $O(d)$. The slide operation
in unary-binary trees appears in [13], which has been a starting point for this 
paper. 

The redistribution of a unary node into four unary nodes two levels below 
proceeds as follows. 





This clearly takes $O(1)$ time. A redistribution makes the lower of the two
groups involved grow by two nodes. To achieve the desired complexity, we need
the groups to retain their approximate sizes. Therefore, when a group has
doubled in size, we split it into two groups of equal size.






To specify where to do a redistribution, we define the parent node of a group
to be the leftmost node two levels above the group, which has all of its
grandchildren inside the group.¹

Assume now that an insertion is to take place below the binary node $v$ at
level one. The process that makes $v$ unary, allowing us to complete the insertion,
can be described in pseudo-code as a call Request($v$) to the following recursive
procedure:

Procedure Request(v)

1. if there is a unary node inside v’s group then
2.     slide it to v
3. else
4.     Request(parent node of v’s group)
5.     redistribute the parent node of v’s group
6.     slide a unary node in v’s group to v
7.     if v’s group has grown too big then
8.         split the group into two of equal size



We show in the next section that the unary nodes at level $h_{\max}$ will last for
the number of insertions stated in Theorem 1. We also show that the amortized
complexity of an insertion is $O(1/\epsilon)$.



6 Analysis 

The reasoning behind the good amortized behavior of our rebalancing scheme is
as follows: To ensure the desired complexity, unary nodes on level one are forced
to be $O(1/\epsilon)$ nodes apart. This in turn forces the weight stored at each level to
decrease exponentially when going upwards in the tree, as the total weight must
be less than $\epsilon N$. Correspondingly, the group sizes, and hence the cost of a slide,
increase exponentially, in our set-up by a factor of two for each new level in
the layers. To counteract this increasing cost, the unary nodes in the layers are
distributed on every second level, as this means that on average $2^2 = 4$ requests
for a unary node at level $1 + 2(i - 1)$ are made for every request for a unary node
at level $1 + 2i$. Thus, on average, an insertion requires $(1/2)^i/\epsilon$ work to be done
on level $1 + 2i$, which summed over all $i$ is $O(1/\epsilon)$. This calculation is slightly
perturbed by the need to split groups, but it turns out that by charging each
insertion $(3/4)^i/\epsilon$ work on level $1 + 2i$, more than enough is charged to cover
the work done in group splittings.
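
The two sums behind this argument can be written out as follows. This is only a worked version of the geometric series, assuming a slide at level $1 + 2i$ costs on the order of $2^i/\epsilon$ and is needed once per $4^i$ insertions on average; the sums are taken over all $i \ge 0$ as an upper bound on the finitely many layer levels.

```latex
\sum_{i \ge 0} \frac{2^i}{\epsilon} \cdot \frac{1}{4^i}
  = \frac{1}{\epsilon} \sum_{i \ge 0} \left(\frac{1}{2}\right)^i
  = \frac{2}{\epsilon},
\qquad
\sum_{i \ge 0} \frac{1}{\epsilon} \left(\frac{3}{4}\right)^i
  = \frac{4}{\epsilon}.
```

Both totals are $O(1/\epsilon)$, which is the bound claimed for an insertion.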

The potential function $\Phi$ below provides a formalization of this argument.

Lemma 4. The amortized rebalancing cost for an insertion is $O(1/\epsilon)$.

¹ It is possible for a group borderline to divide the grandchildren of a node.






Proof. We define a suitable potential function. By the original size of a group,
we mean the size, after the initialization described in section 4, of the original
group from which it has descended by a series of group splittings. Let $G$ be a
group on level $1 + 2i$ containing $j$ nodes, of which $u$ are unary, and let $j_0$ be its
original size. We define the potential $\Phi(G)$ of the group $G$ as

$\max\{0, 4 - u\} \cdot 3^{i+1}/\epsilon + (j - j_0)(3/2)^i$,

and the potential $\Phi(T)$ of the entire tree $T$ as the sum of $\Phi(G)$ over all groups $G$
in $T$. Initially, all groups have (at least) four unary nodes, so $\Phi(T)$ is zero.
Clearly, $\Phi(T)$ is never negative. Thus (see [16]), the amortized rebalancing cost
of any operation on $T$ is the actual rebalancing cost plus the net increase in $\Phi(T)$.
Below, we assume that the unit of work used in $\Phi$ is larger than all multiplicative
constants in the $O$-expressions mentioned.

The rebalancing consists of an initial instance of Request, and possibly a
series of recursive instances (i.e. instances invoked by other instances of
Request). Each instance of Request performs some operations, namely one slide
(line 2 or line 6), zero or one redistribution (line 5), and zero or one splitting
of a group (line 8). The amortized rebalancing costs of these operations depend
on the level of the node given as an argument to the particular instance of
Request.

The actual rebalancing cost of a slide at level $1 + 2i$ is in $O(2^i/\epsilon)$, by the
bound on the group size from Lemma 3 (as groups never become larger than
twice their original size). This is also the amortized cost, as $\Phi(T)$ is not changed.

A group splitting has an actual rebalancing cost of zero.² Before the splitting,
we have one group of size³ $2j_0$, containing four unary nodes. After the splitting, we
have two groups of size $j_0$, containing four unary nodes in total. Hence, a splitting on
level $1 + 2i$ incurs an increase in $\Phi(T)$ of $4 \cdot 3^{i+1}/\epsilon - j_0 \cdot (3/2)^i$. By the lower
bound on $j_0$ from Lemma 3, the amortized rebalancing cost of a group splitting is
actually negative.
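
To make the sign of this change concrete, here is a small Python sketch that evaluates the potential of a group before and after a split. The numbers are hypothetical and chosen only so that $j_0$ exceeds $12 \cdot 2^i/\epsilon$, a threshold derived here from the formula itself rather than quoted from Lemma 3. The function name is our own.

```python
from fractions import Fraction

def group_potential(i, j, j0, u, eps):
    """Potential of a group on level 1 + 2i with j nodes, u of them unary,
    and original size j0 (names mirror the text; eps is a Fraction)."""
    return max(0, 4 - u) * Fraction(3**(i + 1)) / eps + (j - j0) * Fraction(3, 2)**i

# Hypothetical values with j0 > 12 * 2^i / eps = 192.
i, eps, j0 = 2, Fraction(1, 4), 256

before = group_potential(i, 2 * j0, j0, 4, eps)
# The four unary nodes may land in the two halves in any way;
# every distribution gives the same total of max(0, 4 - u).
after = group_potential(i, j0, j0, 1, eps) + group_potential(i, j0, j0, 3, eps)

change = after - before
print(change)  # negative here, so the split releases potential
```

With these values the change equals $4 \cdot 3^{i+1}/\epsilon - j_0 (3/2)^i$, as in the text.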

A redistribution from level $1 + 2i$ to level $1 + 2(i - 1)$ removes one unary
node from the upper group involved, and hence increases its potential by $3^{i+1}/\epsilon$.
The lower group involved contains no unary nodes and receives four. Also, its
size grows by two. Hence, its potential decreases by $4 \cdot 3^i/\epsilon - 2(3/2)^{i-1}$. The
actual rebalancing cost of a redistribution is $O(1)$, making its total amortized
cost $1 + 2(3/2)^{i-1} - 3^i/\epsilon$.

All recursive calls to Request (line 4) are immediately followed by a
redistribution (line 5). Hence, the combined amortized cost of the slide in a recursive
instance of Request and the redistribution immediately following the call
invoking this instance is $2^i/\epsilon + 1 + 2(3/2)^{i-1} - 3^i/\epsilon$. This is negative for all $i \ge 1$,
as $\epsilon < 1/2$.

² Of course, it takes $\Theta$(size of the group) work to decide where to split. This work
may be included in the count for the slide preceding the split.

³ Actually, if $j_0$ is odd, the size is $2j_0 + 1$ before the splitting and $j_0$ and $j_0 + 1$ after
the splitting, since groups always grow by two nodes. We leave it to the reader to
verify that the amortized cost of a slide is the same in this case.






By this telescoping effect, the total sum of the amortized rebalancing costs
for all the operations performed during the rebalancing for an insertion, except
the slide operation in the initial instance of Request, is negative. This slide
operation takes place at level one, and hence, as seen above, has an amortized
cost of $O(2^0/\epsilon)$. Apart from a call to Request, an insertion also converts a
unary node on level one to a binary node. This increases $\Phi(T)$ by $3^1/\epsilon$. In total,
the amortized rebalancing cost of an insertion is $O(1/\epsilon)$. □

The amortized cost just calculated does not include the initialization of the
rebalancing scheme. This initialization consists of a global rebuilding of the tree,
and hence takes $O(N)$ time. As stated in the lemma below, the rebalancing scheme
will last for more than $\epsilon N/2$ insertions. Thus, amortized over these operations,
the initialization cost is $O(1/\epsilon)$ per insertion.

Lemma 5. If all unary nodes at level $h_{\max}$ have been removed by redistributions,
at least $5\epsilon N/8$ insertions have taken place.

Proof. The lemma can be proved using the following facts: The total weight of
the initial $R$ unary nodes in the reservoir is at least $3\epsilon N/4$. The total weight
in the layers can grow (as groups get larger) from its initial value, but never to
more than $\epsilon N/8$, as can be proved using Lemma 3. Weight can only disappear
during insertions, not during rebalancing, and each insertion removes a weight
of exactly one.

The details appear in the full paper. □ 

7 Conclusion 

In this paper, we have given a rebalancing scheme for binary search trees which 
allows an optimal trade-off between the height bound and the rebalancing cost. 

We have, however, only dealt with amortized complexity. For worst case
complexity, the amortized lower bound in [8] of course still applies. The best existing
upper bound is height $\lceil \log(n + 1) + \min\{1/\sqrt{f(n)}, \log(n)/f(n)\} \rceil$ maintained in
$O(f(n))$ worst case time, by a combination of results in [3] and [8], and it
remains an open problem to close this gap. We conjecture the lower bound to be
tight.

References 

1. G. M. Adel’son-Vel’skii and E. M. Landis. An Algorithm for the Organisation
of Information. Dokl. Akad. Nauk SSSR, 146:263-266, 1962. In Russian. English
translation in Soviet Math. Dokl., 3:1259-1263, 1962.

2. A. Andersson. Optimal bounds on the dictionary problem. In Proc. Symp. on
Optimal Algorithms, Varna, volume 401 of LNCS, pages 106-114. Springer-Verlag,
1989.

3. A. Andersson. Efficient Search Trees. PhD thesis, Department of Computer Science,
Lund University, Sweden, 1990.






4. A. Andersson, C. Icking, R. Klein, and T. Ottmann. Binary search trees of almost
optimal height. Acta Informatica, 28:165-178, 1990.

5. A. Andersson and T. W. Lai. Fast updating of well-balanced trees. In SWAT’90,
volume 447 of LNCS, pages 111-121. Springer-Verlag, 1990.

6. A. Andersson and T. W. Lai. Comparison-efficient and write-optimal searching and
sorting. In ISA’91, volume 557 of LNCS, pages 273-282. Springer-Verlag, 1991.

7. N. Blum and K. Mehlhorn. On the average number of rebalancing operations in
weight-balanced trees. Theoretical Computer Science, 11:303-320, 1980.

8. R. Fagerberg. Binary search trees: How low can you go? In SWAT’96, volume 1097
of LNCS, pages 428-439. Springer-Verlag, 1996.

9. L. J. Guibas and R. Sedgewick. A Dichromatic Framework for Balanced Trees. In
19th FOCS, pages 8-21, 1978.

10. D. E. Knuth. Sorting and Searching, volume 3 of The Art of Computer
Programming. Addison-Wesley, 1973.

11. T. Lai. Efficient Maintenance of Binary Search Trees. PhD thesis, Department of
Computer Science, University of Waterloo, Canada, 1990.

12. T. Lai and D. Wood. Updating almost complete trees or one level makes all the
difference. In STACS’90, volume 415 of LNCS, pages 188-194. Springer-Verlag,
1990.

13. H. A. Maurer, T. Ottmann, and H.-W. Six. Implementing dictionaries using binary
trees of very small height. Inf. Proc. Letters, 5:11-14, 1976.

14. J. Nievergelt and E. M. Reingold. Binary search trees of bounded balance. SIAM
J. on Computing, 2(1):33-43, 1973.

15. T. Ottmann, D. S. Parker, A. L. Rosenberg, H. W. Six, and D. Wood. Minimal-cost
brother trees. SIAM J. Computing, 13(1):197-217, 1984.

16. R. E. Tarjan. Amortized computational complexity. SIAM J. on Algebraic and
Discrete Methods, 6:306-318, 1985.



Fast Allocation and Deallocation 
with an Improved Buddy System* 



Erik D. Demaine and Ian J. Munro 



Department of Computer Science, University of Waterloo 
Waterloo, Ontario N2L 3G1, Canada 
{eddemaine, imunro}@uwaterloo.ca



Abstract. We propose several modifications to the binary buddy
system for managing dynamic allocation of memory blocks whose sizes are
powers of two. The standard buddy system allocates and deallocates
blocks in $\Theta(\lg n)$ time in the worst case (and on an amortized basis),
where $n$ is the size of the memory. We present two schemes that improve
the running time to $O(1)$ time, where the time bound for deallocation
is amortized. The first scheme uses one word of extra storage compared
to the standard buddy system, but may fragment memory more than
necessary. The second scheme has essentially the same fragmentation as
the standard buddy system, and uses $O(2^{(1+\sqrt{\lg n}) \lg \lg n})$ bits of auxiliary
storage, which is $\omega(\lg^k n)$ but $o(n^\epsilon)$ for all $k \ge 1$ and $\epsilon > 0$. Finally, we
present simulation results estimating the effect of the excess
fragmentation in the first scheme.



1 Introduction 

The binary buddy system [13] is a well-known system for maintaining a dynamic 
collection of memory blocks. Its main feature is the use of suitably aligned blocks 
whose sizes are powers of two. This makes it easy to check whether a newly 
deallocated block can be merged with an adjacent (unused) block, using bit 
operations on the block addresses. See Section 1.1 for a more detailed description 
of the method. 

While the buddy system is generally recognized as fast, we argue that it is
much slower than it has to be. Specifically, the time to allocate or deallocate
a block of size $2^k$ is $\Theta(1 - k + \lg n)$ in the worst case, where $n$ is the size of
the memory in bytes. Not only is this a worst-case lower bound, but this much
time can also be necessary on an amortized basis. Once we encounter a block
whose allocation requires $\Theta(1 - k + \lg n)$ time, we can repeatedly deallocate and
reallocate that block, for a total cost of $\Theta(m(1 - k + \lg n))$ over $m$ operations.
Such allocations and deallocations are also not rare; for example, if the memory
is completely free and we allocate a constant-size block, then the buddy system
uses $\Theta(\lg n)$ time. Throughout this paper we assume standard operations on a
word of size $1 + \lg n$ or so bits can be performed in constant time.

* This work was supported by the Natural Science and Engineering Research Council 
of Canada (NSERC). 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 84-96, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999 






1.1 Buddy System 

The (binary) buddy system was originally described by Knowlton [11,12]. It is
much faster than other heuristics for dynamic memory allocation, such as first-fit
and best-fit. Since its only disadvantage is that blocks must be powers of two in
size, the buddy system is used in many modern operating systems, in particular
most versions of UNIX, for small block sizes. For example, BSD [18] uses the
buddy system for blocks smaller than a page, i.e., 4 kilobytes.

The classic description of the buddy system is Knuth’s [13]. Because our work 
is based on the standard buddy system, we review the basic ideas now. 

At any point in time, the memory consists of a collection of blocks of con- 
secutive memory, each of which is a power of two in size. Each block is marked 
either occupied or free, depending on whether it is allocated to the user. For each 
block we also know its size (or the logarithm of its size). The system provides 
two operations for supporting dynamic memory allocation: 

1. Allocate($2^k$): Finds a free block of size $2^k$, marks it as occupied, and returns
a pointer to it.

2. Deallocate($B$): Marks the previously allocated block $B$ as free and may
merge it with others to form a larger free block.

The buddy system maintains a list of the free blocks of each size (called a free
list), so that it is easy to find a block of the desired size, if one is available. If no
block of the requested size is available, Allocate searches for the first nonempty
list for blocks of at least the size requested. In either case, a block is removed
from the free list. This process of finding a large enough free block will indeed
be the most difficult operation for us to perform quickly.

If the found block is larger than the requested size, say $2^k$ instead of the
desired $2^i$, then the block is split in half, making two blocks of size $2^{k-1}$. If this
is still too large ($k - 1 > i$), then one of the blocks of size $2^{k-1}$ is split in half.
This process is repeated until we have blocks of size $2^{k-1}, 2^{k-2}, \ldots, 2^{i+1}, 2^i$,
and $2^i$. Then one of the blocks of size $2^i$ is marked as occupied and returned to
the user. We will modify this splitting process as the first step in speeding up
the buddy system.
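
The repeated halving described above can be sketched in a few lines of Python. This is only an illustration of the standard splitting process on block sizes; the function name and return shape are our own, and addresses and free lists are left out.

```python
def standard_split(k, i):
    """Split a free block of size 2^k to serve a request of size 2^i.

    Returns (allocated_size, free_sizes, splits), where free_sizes lists
    the halves that go back onto the free lists.
    """
    assert 0 <= i <= k
    free_sizes = []
    size = 2**k
    splits = 0
    while size > 2**i:
        size //= 2                # split the current block in half
        free_sizes.append(size)   # one half is kept free
        splits += 1
    return size, free_sizes, splits

print(standard_split(6, 2))
```

The number of splits is $k - i$, which is the source of the logarithmic cost that the paper sets out to remove.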

Now when a block is deallocated, the buddy system checks whether the block
can be merged with any others, or more precisely whether we can undo any splits
that were performed to make this block. This is where the buddy system gets its
name. Each block $B_1$ (except the initial blocks) was created by splitting another
block into two halves, call them $B_1$ (with the same address but half the size)
and $B_2$. The other block, $B_2$, created from this split is called the buddy of $B_1$,
and vice versa. The merging process checks whether the buddy of a deallocated
block is also free, in which case the two blocks are merged; then it checks whether
the buddy of the resulting block is also free, in which case they are merged; and
so on.

One of the main features of the buddy system is that buddies are very easy
to compute on a binary computer. First note that because of the way we split
and merge blocks, blocks stay aligned. More precisely, the address of a block of
size $2^k$ (which we always consider to be written in binary) ends with $k$ zeros. As
a result, to find the address of the buddy of a block of size $2^k$ we simply flip the
$(k+1)$st bit from the right.
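
As a minimal sketch of this bit trick, assuming addresses are plain non-negative integers measured in units of the smallest block, the buddy address is one exclusive-or away. The function name is hypothetical.

```python
def buddy_address(address, k):
    """Address of the buddy of the block of size 2^k at `address`."""
    # Alignment: the address of a block of size 2^k ends with k zero bits.
    assert address % (1 << k) == 0
    # Flip the (k+1)st bit from the right.
    return address ^ (1 << k)

print(buddy_address(48, 4), buddy_address(32, 4))
```

Applying the function twice returns the original address, which reflects that the buddy relation is symmetric.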

Thus it is crucial for performance purposes to know, given a block address, 
the size of the block and whether it is occupied. This is usually done by storing 
a block header in the first few bits of the block. More precisely, we use headers 
in which the first bit is the occupied bit, and the remaining bits specify the size 
of the block. Thus, for example, to determine whether the buddy of a block is 
free, we compute the buddy’s address, look at the first bit at this address, and 
also check that the two sizes match. 

Because block sizes are always powers of two, we can just encode their
logarithms in the block headers. This uses only $\lg \lg n$ bits, where $n$ is the number of
(smallest) blocks that can be allocated. As a result, the smallest practical header,
one byte long, is sufficient to address up to $2^{128} \approx 3.4 \cdot 10^{38}$ blocks. Indeed, if
we want to use another bit of the header to store some other information, the
remaining six bits suffice to encode up to $2^{64} \approx 1.8 \cdot 10^{19}$ blocks, which should
be large enough for any practical purposes.
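
A one-byte header of this kind could be packed as in the following Python sketch. The layout (occupied flag in the top bit, logarithm of the size in the low seven bits) is an assumption made for illustration; the paper only says that the first bit is the occupied bit and the remaining bits give the size.

```python
def pack_header(occupied, log_size):
    """Pack the occupied bit and the logarithm of the block size into one byte."""
    assert 0 <= log_size < 128           # seven bits for the logarithm
    return (int(occupied) << 7) | log_size

def unpack_header(byte):
    """Inverse of pack_header: return (occupied, log_size)."""
    return bool(byte >> 7), byte & 0x7F

header = pack_header(True, 12)           # an occupied block of size 2^12
print(header, unpack_header(header))
print(f"{2**128:.1e}", f"{2**64:.1e}")   # the two magnitudes quoted in the text
```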

1.2 Related Work 

Several other buddy systems have been proposed, which we briefly survey now. 
Of general interest are the Fibonacci and weighted buddy systems, but none of 
the proposals theoretically improve the running time of the Allocate and Deallo- 
cate operations. 

In Exercise 2.5.31 of his book, Knuth [13] proposed the use of Fibonacci 
numbers as block sizes instead of powers of two, resulting in the Fibonacci buddy 
system. This idea was detailed by Hirschberg [9], and was optimized by Hinds [8]
and Cranston and Thomas [7] to locate buddies in time similar to the binary 
buddy system. Both the binary and Fibonacci buddy systems are special cases 
of a generalization proposed by Burton [5]. 

Shen and Peterson [23] proposed the weighted buddy system which allows
blocks of sizes $2^k$ and $3 \cdot 2^k$ for all $k$. All of the above schemes are special
cases of the generalization proposed by Peterson and Norman [20] and a further
generalization proposed by Russell [22]. Page and Hagins [19] proposed an
improvement to the weighted buddy system, called the dual buddy system, which
reduces the amount of fragmentation to nearly that of the binary buddy
system. Another slight modification to the weighted buddy system was described
by Bromley [3,4]. Koch [14] proposed another variant of the buddy system that
is designed for disk-file layout with high storage utilization.

The fragmentation of these various buddy systems has been studied both 
experimentally and analytically by several papers [4,6,17,20,21,22]. 

1.3 Outline 

Sections 2 and 3 describe our primary modifications to the Allocate and
Deallocate operations of the binary buddy system. Finding an appropriate free block
for Allocate is the hardest part, so our initial description of Allocate assumes
that such a block has been found, and only worries about the splitting part. In
Sections 4 and 5 we present two methods for finding a block to use for allocation.
Finally, Section 6 gives simulation results comparing the fragmentation in the
two methods.

2 Lazy Splitting 

Recall that if we allocate a small block out of a large block, say $2^i$ out of $2^k$ units,
then the standard buddy system splits the large block $k - i$ times, resulting in
subblocks of sizes $2^{k-1}, 2^{k-2}, \ldots, 2^{i+1}, 2^i$, and $2^i$, and then uses one of the
blocks of size $2^i$. The problem is that if we immediately deallocate the block of
size $2^i$, then all $k - i + 1$ blocks must be remerged into the large block of size $2^k$.
This is truly necessary in order to discover that a block of size $2^k$ is available;
the next allocation request may be for such a block.

To solve this problem, we do not explicitly perform the first $k - i - 1$ splits,
and instead jump directly to the last split at the bottom level. That is, the large
block of size $2^k$ is split into two blocks, one of size $2^i$ and one of size $2^k - 2^i$. Note
that the latter block has size not equal to a power of two. We call it a superblock,
and it contains allocatable blocks of sizes $2^i, 2^{i+1}, \ldots, 2^{k-2}$, and $2^{k-1}$ (which
sum to the total size $2^k - 2^i$). For simplicity of the algorithms, we always remove
the small block of size $2^i$ from the left side of the large block of size $2^k$, and
hence the allocatable blocks contained in a superblock are always in order of
increasing size.

In general, we maintain the size of each allocated block as a power of two, 
while the size of a free block is either a power of two or a difference of two powers 
of two. Indeed, we can view a power of two as the difference of two consecutive 
powers of two. Thus, every free block can be viewed as a superblock, containing 
one or more allocatable blocks each of a power of two in size. 
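
The view of a free block as a run of consecutive powers of two can be checked with a short Python sketch. The function name is our own; the pair $(k, i)$ stands for a superblock of size $2^k - 2^i$.

```python
def allocatable_blocks(k, i):
    """Sizes of the blocks inside a superblock of size 2^k - 2^i, left to right."""
    assert 0 <= i < k
    return [2**m for m in range(i, k)]

blocks = allocatable_blocks(6, 2)
print(blocks, sum(blocks) == 2**6 - 2**2)

# A plain block of size 2^4 is the superblock 2^5 - 2^4 with a single member.
print(allocatable_blocks(5, 4))
```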

To see how the free superblocks behave, let us consider what happens when we
allocate a subblock of size $2^j$ out of a free superblock of size $2^k - 2^i$. The free block
is the union of allocatable blocks of sizes $2^i, 2^{i+1}, \ldots, 2^{k-1}$, and hence $i \le j \le k - 1$.
Removing the allocatable block of size $2^j$ leaves two consecutive sequences:
$2^i, 2^{i+1}, \ldots, 2^{j-1}$ and $2^{j+1}, 2^{j+2}, \ldots, 2^{k-1}$. Thus, we split the superblock into the
desired block of size $2^j$ and two new superblocks of sizes $2^j - 2^i$ and $2^k - 2^{j+1}$.



2.1 Block Headers 

As described in Section 1.1, to support fast access to information about a block 
given just its address (for example when the user requests that a block be deal- 
located) it is common to have a header on every block that contains basic infor- 
mation about the block. Recall that block headers in the standard buddy system 
store a single bit specifying whether the block is occupied (which is used to test 
whether a buddy is occupied), together with a number specifying the logarithm 
of the size of the block (which is used to find the buddy). 






With our modifications, superblocks are no longer powers of two in size. The
obvious encoding that uses $\Theta(\lg n)$ bits causes the header to be quite large:
a single byte is insufficient even for the levels at which the buddy system is
applied today (for example, smaller than 4,096 bytes in BSD 4.4 UNIX [18]).
Fortunately, there are two observations which allow us to leave the header at its
current size. The first is that allocated blocks are a power of two in size, and
hence the standard header suffices. The second is that free blocks are a difference
of two powers of two in size, and hence two bytes suffice; the second byte can be
stored in the data area of the block (which is unused because the block is free).



2.2 Split Algorithm 

To allocate a block of size $2^j$, we first require a free superblock containing a
properly aligned block of at least that size, that is, a superblock of size $2^k - 2^i$
where $k > i$ and $k > j$. Finding this superblock will be addressed in Sections 4
and 5. The second half of the Allocate algorithm is to split that superblock
down to the appropriate size, and works as follows. Assume the superblock $B$
at address $a$ has an appropriate size. First examine the header of the block
at address $a$. By assumption, the occupied bit must be clear, i.e., $B$ must be
free. The next two numbers of the header are $k$ and $i$ and specify that $B$ has
size $2^k - 2^i$. In other words, $B$ is a superblock containing allocatable blocks of
size $2^i, 2^{i+1}, \ldots, 2^{k-1}$, in that order.

There are two cases to consider. The first case is that one of the blocks in $B$
has size $2^j$, that is, $i \le j \le k - 1$. The address of this block is $a + \sum_{m=i}^{j-1} 2^m =
a + 2^j - 2^i$. First we initialize the header of this block, by setting the occupied
bit and initializing the logarithm of the size to $j$. The address of this block is
also returned as the result of the Allocate operation. Next $B$ has to be split into
the allocated block and, potentially, a superblock on either side of it. If $j < k - 1$
(in other words, we did not allocate the last block in the superblock), then we
need a superblock on the right side. Thus, we must initialize the block header
at address $a + 2^{j+1} - 2^i$ with a clear occupied bit followed by $k$ and $j + 1$. (That
is, the block has size $2^k - 2^{j+1}$.) Similarly, if $j > i$ (in other words, we did not
allocate the first block in the superblock), then we need a superblock on the left
side. Thus, we modify the first number in the header at address $a$ from $k$ to $j$,
thereby specifying that the block now has size $2^j - 2^i$.

The second case is the undesirable case in which $i > j$, in other words we
must subdivide one of the blocks in superblock $B$ to make the allocation. The
smallest block in $B$, of size $2^i$, is adequate. It is broken off and the remainder
of $B$ is initialized as a free superblock of size $2^k - 2^{i+1}$. The block of size $2^i$ is
broken into the allocated block of size $2^j$, and a superblock of size $2^i - 2^j$ which
is returned to the structure for future use.
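
The two cases can be summarized in a short Python sketch that works on addresses and size exponents only, without touching real memory. The function name and the tuple representation `(address, hi, lo)` for a free superblock of size $2^{hi} - 2^{lo}$ are our own. In the second case the sketch assumes the allocated block is cut from the left end of the broken-off block, following the convention stated in Section 2.

```python
def split_superblock(a, k, i, j):
    """Allocate a block of size 2^j from the free superblock at address a,
    whose size is 2^k - 2^i.

    Returns (allocated_address, free_superblocks); each free superblock is
    a tuple (address, hi, lo) of size 2^hi - 2^lo.
    """
    assert 0 <= i < k and 0 <= j < k
    free = []
    if i <= j:
        # Case 1: the superblock contains a block of exactly size 2^j.
        allocated = a + 2**j - 2**i
        if j > i:                        # blocks 2^i .. 2^(j-1) remain on the left
            free.append((a, j, i))
        if j < k - 1:                    # blocks 2^(j+1) .. 2^(k-1) remain on the right
            free.append((a + 2**(j + 1) - 2**i, k, j + 1))
    else:
        # Case 2: break off the smallest block (size 2^i) and subdivide it.
        allocated = a                    # assumption: allocate from its left end
        free.append((a + 2**j, i, j))    # rest of the broken-off block
        if i < k - 1:                    # remainder of the original superblock
            free.append((a + 2**i, k, i + 1))
    return allocated, free

print(split_superblock(2, 6, 1, 3))
print(split_superblock(16, 6, 4, 2))
```

Each call does a bounded number of header updates regardless of $k - i$, which is the point of Lemma 1.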

An immediate consequence of the above modification to the split operation 
is the following: 

Lemma 1. The cost of a split is constant. 



Fig. 1. An example in which blocks have been aggressively merged into
superblocks, but a single deallocation causes $\Theta(\lg n)$ merges.



3 Unaggressive Merging 

This section describes how merging works in combination with blocks whose
sizes are not powers of two. Our goal is for merges to undo already performed
splits, because the conditions that caused the split no longer hold. However, we
are not too aggressive about merging: we do not merge adjacent superblocks
into larger superblocks. Instead, we wait until a collection of superblocks can be
merged into a usual block of a power of two in size. This is because we will only
use superblocks to speed up splits. An amortized time bound for merges follows
immediately. Unfortunately, even “aggressive merging” of adjacent superblocks
would not be enough to obtain a worst-case time bound; see Fig. 1.

Hence, our problem reduces to detecting mergeable buddies in the standard 
sense, except that buddies may not match in size: the left or right buddy may 
be a much larger superblock. This can be done as follows. Suppose we have just 
deallocated a block B and want to merge it with any available buddies. First we 
clear the occupied bit in the header of B. Next we read the logarithm of the size 
of the block, call it i, and check whether B can be merged with any adjacent 
blocks, or in other words whether it can be merged with its buddy, as follows. 

Because of the alignment of allocated blocks, the last $i$ bits of the address
of $B$ must be zeros. If the $(i+1)$st bit from the right is a zero, then our block is a
left buddy of some other block; otherwise, it is a right buddy. In either case, we
can obtain the address of $B$’s buddy by flipping the $(i+1)$st bit of $B$’s address,
that is, by taking the bitwise exclusive-or of the address with 1 shifted left $i$ times.

If the header of $B$’s buddy has the occupied bit clear, we read its size $2^k - 2^j$.
If $B$’s size equals the lacking size $2^j$ (i.e., $i = j$), we merge the buddies and
update the header to specify a size of $2^k$. In this case, we repeat the process to
see whether the buddy of the merged block is also free.
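
The merge loop can be sketched in Python on a toy model of the headers. Here `headers` is a hypothetical dictionary from block address to `(occupied, hi, lo)`, meaning a block of size $2^{hi} - 2^{lo}$; a plain block of size $2^i$ is stored as `(i + 1, i)`. Free lists are omitted, and this representation is our own simplification, not the paper's header layout.

```python
def deallocate(headers, addr):
    """Free the allocated block at addr and merge it with free buddies.

    Returns (address, log_size) of the resulting power-of-two block.
    """
    occupied, hi, lo = headers[addr]
    assert occupied and hi == lo + 1      # allocated blocks are powers of two
    i = lo
    headers[addr] = (False, i + 1, i)     # clear the occupied bit
    while True:
        buddy = addr ^ (1 << i)           # flip the (i+1)st bit from the right
        if buddy not in headers:
            break                         # no block starts there (end of memory)
        b_occupied, b_hi, b_lo = headers[buddy]
        if b_occupied or b_lo != i:
            break                         # buddy in use, or it does not lack exactly 2^i
        del headers[addr], headers[buddy]
        addr = min(addr, buddy)
        i = b_hi                          # 2^i + (2^b_hi - 2^i) = 2^b_hi
        headers[addr] = (False, i + 1, i)
    return addr, i

# A memory of 64 units after allocating sizes 2 and 8 by lazy splitting.
headers = {0: (True, 2, 1), 2: (False, 3, 1), 8: (True, 4, 3), 16: (False, 6, 4)}
first = deallocate(headers, 8)    # buddy at address 0 is occupied: no merge
second = deallocate(headers, 0)   # merges step by step back into one block
print(first, second, headers)
```

In this example the second deallocation performs three merges, each undoing one lazy split or absorbing an earlier freed block, in line with the counting argument behind Lemma 2.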



Lemma 2. The cost of a sequence of merges is constant amortized. 






Proof. The total number of individual merges is at most twice the number of
already performed (and not remerged) splits, and hence each sequence of merge
operations takes $O(1)$ amortized time. □

4 Finding a Large Enough Free Block: Fragmentation 

This section presents our first approach to the remaining part of the Allocate 
algorithm, which is to find a block to return (if it is of the correct size) or split 
(if it is too large). More precisely, we need to find a block that is at least as large 
as the desired size. The standard buddy system maintains a doubly linked list of 
free blocks of each size for this purpose. Indeed, the free list is usually stored as 
a doubly linked list whose nodes are the free blocks themselves (since they have 
free space to use) . The list must be doubly linked to support removal of a block 
in the middle of the list as the result of a merge. 

We do the same, where a superblock of size $2^k - 2^j$ is placed on the free list
for blocks of size $2^{k-1}$, corresponding to the largest allocatable block contained
within it. This will give us the smallest superblock that is large enough to handle
the request. However, it may result in splitting a block when unnecessary; we
shall readdress this issue in the next section.

The difficulty in finding the smallest large-enough superblock is that when
(for example) there is a single, large block and we request the smallest possible
block, it takes $O(\lg n)$ time to do a linear scan for the appropriate free list. To find
the appropriate list in $O(1)$ worst-case time, we maintain a bitvector of length
$\lfloor \lg n \rfloor$, whose (i+1)st bit from the right is set precisely if the list for blocks of
size $2^i$ is nonempty. Then the next nonempty list after or at a particular size $2^k$
can be found by first shifting the bitvector right by k, and then computing the
least-significant set bit.
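
As a small sketch of this lookup, the bitvector can be modelled as a Python integer. The function name and the use of `x & -x` to isolate the lowest set bit are choices made for this illustration, not something the text prescribes.

```python
def next_nonempty_list(bitvector, k):
    """Return the smallest i >= k such that bit i of bitvector is set.

    Bit i (counted from the right, starting at 0) is assumed to be set
    exactly when the free list for blocks of size 2**i is nonempty.
    Returns None if no list of size 2**k or larger is nonempty.
    """
    shifted = bitvector >> k          # discard lists for sizes below 2**k
    if shifted == 0:
        return None
    lowest = shifted & -shifted       # isolate the least-significant set bit
    return k + lowest.bit_length() - 1
```

With lists for sizes $2^3$ and $2^5$ nonempty (bitvector `0b101000`), a request of size $2^4$ is directed to the list for size $2^5$.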

The latter operation is included as an instruction in many modern machines.
Newer Pentium chips do it as quickly as an integer addition. It can also be
computed in constant time using boolean and basic arithmetic operations [2].
Another very simple method is to store a lookup table of the solutions for all
bitstrings of length $\varepsilon \lg n$ for some constant $\varepsilon > 0$, using $O(n^{\varepsilon})$ words of space;
cut the bitvector into $\lceil 1/\varepsilon \rceil$ chunks; and check for a set bit in each chunk from
right to left. This $O(n^{\varepsilon})$ extra space is justified because many data structures
require this operation, and it is perfectly reasonable for the operating system to
provide a common static table for all processes to access.
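
The table-based variant might look like the following sketch. The parameter `chunk_bits` plays the role of $\varepsilon \lg n$; both function names and the choice to build the table by a naive scan are assumptions of this example.

```python
def build_lsb_table(chunk_bits):
    """Table of the least-significant set bit for every chunk value.

    Entry 0 is None because a zero chunk has no set bit.
    """
    table = [None] * (1 << chunk_bits)
    for x in range(1, 1 << chunk_bits):
        pos = 0
        while ((x >> pos) & 1) == 0:
            pos += 1
        table[x] = pos
    return table


def lsb_by_table(x, total_bits, chunk_bits, table):
    """Least-significant set bit of x, scanning chunks from right to left."""
    mask = (1 << chunk_bits) - 1
    for offset in range(0, total_bits, chunk_bits):
        chunk = (x >> offset) & mask
        if chunk:
            return offset + table[chunk]
    return None  # no bit set within total_bits
```

The loop runs over a constant number of chunks when `chunk_bits` is a constant fraction of `total_bits`, which is what makes the method constant-time in the setting of the text.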

Theorem 1. The described modifications to the buddy system cause Allocate 
[Deallocate] to run in constant worst-case [amortized] time. 



5 Finding a Large Enough Free Block: Extra Storage

One unfortunate property of the method described above is that even if a block 
of the desired size is available as part of a superblock, it may not be used because 
preference is given to a larger block. The reason is that our method prefers a
superblock whose largest allocatable block is minimal. Unfortunately, such a
superblock may not contain an allocatable block of exactly (or even close to) 
the desired size, whereas a superblock containing a larger largest block might. 
Furthermore, even if there is no block of exactly the desired size, our method 
will not find the smallest one to split. As a result, unnecessary splits may be 
performed, slightly increasing memory fragmentation. We have not performed a 
statistical analysis of the effect in fragmentation as a result of this property, but 
simulation results are presented in Section 6. 

In this section, we present a further modification that solves this problem 
and leaves the fragmentation in essentially the same state as does the standard 
buddy system. Specifically, we abstract the important properties of the standard 
buddy system’s procedure for finding a large enough free block into the following 
minimum-splitting requirement: the free block chosen must be the smallest block
that is at least the desired size. In particular, if there is a block of exactly the 
desired size, it will be chosen. This requirement is achieved by the standard 
buddy system, and the amount of block splitting is locally minimized by any 
method achieving it. 

Of course, there may be ties in “the smallest block that is at least the desired 
size,” so different “minimum-splitting” methods may result in different pairs of 
blocks becoming available for remerging, and indeed do a different amount of 
splitting on the same input sequence. However, we observe that even different 
implementations of the “standard buddy system” will make different choices. 
Furthermore, if we view all blocks of the same size as being equally likely to 
be deallocated at any given time, then all minimum-splitting systems will have 
identical distributions of fragmentation. 

In the context of the method described so far, the minimum-splitting
requirement specifies that we must find a superblock containing a block of the
appropriate size if one exists, and if none exists, we must find the superblock
containing the smallest block that is large enough. This section describes how to
solve the following more difficult problem in constant time: find the superblock
whose smallest contained block is smallest, over all superblocks whose largest
contained block is large enough to serve the query.

Recall that superblocks have size $2^k - 2^j$ for $1 \le k \le \lceil \lg n \rceil$ and $0 \le j \le k - 1$.
For each of the possible (k, j) pairs, we maintain a doubly linked list of all
free superblocks of size $2^k - 2^j$ (where “superblock” includes the special case
of “block”). By storing the linked list in the free superblocks themselves, the
auxiliary storage required is only $O(\lg^2 n)$ pointers or $O(\lg^3 n)$ bits.

For each value of k, we also maintain a bitvector $V_k$ of length $\lfloor \lg n \rfloor$, whose jth
bit indicates whether there is at least one superblock of size $2^k - 2^j$. This vector
can clearly be maintained in constant time subject to superblock allocations and
deallocations. By finding the least-significant set bit, we can also maintain the
minimum set j in $V_k$ for each k, in constant time per update.

The remaining problem is to find some superblock of size $2^k - 2^j$ for which,
subject to the constraint k > i, j is minimized. This way, if $j \le i$, this superblock
contains a block of exactly the desired size; and otherwise, it contains the smallest
block adequate for our needs. The problem is now abstracted into having a
vector $V_{\min}$, whose kth element ($1 \le k \le \lfloor \lg n \rfloor$) is the minimum j for which there
is a free superblock of size $2^k - 2^j$; in other words, $V_{\min}[k] = \min\{j \mid V_k[j] > 0\}$.
Given a value i, we are to find the smallest value of j in the last $\lfloor \lg n \rfloor - i$
positions of $V_{\min}$; in other words, we must find $\min\{V_{\min}[k] \mid k > i\}$.

The basic idea is that, because each element of $V_{\min}$ takes only $\lg \lg n$
bits to represent, “many” can be packed into a single $(1 + \lg n)$-bit word. Indeed,
we will maintain a dynamic multiway search tree of height 2. The $\lfloor \lg n \rfloor$ elements
of $V_{\min}$ are split into roughly $\sqrt{\lg n}$ groups of roughly $\sqrt{\lg n}$ elements each.
The pth child of the root stores the elements in the pth group. The root contains
roughly $\sqrt{\lg n}$ elements, the pth of which is the minimum element in the pth
group. As a consequence, each node occupies $\sqrt{\lg n}\,\lg \lg n$ bits.

A query for a given i is answered in two parts. First we find the minimum of
the first $\lfloor i/\sqrt{\lg n} \rfloor$ elements of the root node, by setting the remaining elements
to infinities, and using a table of answers for all possible root nodes. The second
part in determining the answer comes from inspecting the $\lceil i/\sqrt{\lg n} \rceil$th branch
of the tree, which in general will contain some superblocks (or j values) that
are valid for our query and some that are not. We must, then, make a similar
query there, and take the smallest j value of the two. The extra space required is
dominated by the table that gives the value and position of the smallest element,
for all possible $\sqrt{\lg n}$-tuples of $\lg \lg n$ bits each. There are $2^{\sqrt{\lg n}\,\lg \lg n}$ entries in
this table, and each entry requires $2 \lg \lg n$ bits, for a total of $2^{\sqrt{\lg n}\,\lg \lg n + 1} \lg \lg n$ bits.

As a consequence, the total space required beyond the storage we are managing
is $o(n^{\varepsilon})$ but $\omega(\lg^{k} n)$. Updating the structure to keep track of the minimum j
for each k, in constant time after each memory allocation or deallocation, is
straightforward.
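
The two-level idea can be illustrated with the sketch below. It is not the word-packed structure of the text: Python's `min` over a short slice stands in for the table lookup on a packed node, positions are 0-based, and the class and method names are assumptions of this example. It only shows how a query decomposes into one partial group plus one scan of the root.

```python
import math

INF = float("inf")


class TwoLevelMin:
    """Height-2 tree over a vector; root[p] is the minimum of group p."""

    def __init__(self, values):
        self.values = list(values)
        # Roughly sqrt(len) groups of roughly sqrt(len) elements each.
        self.g = max(1, math.isqrt(len(self.values)))
        self.root = [min(self.values[s:s + self.g])
                     for s in range(0, len(self.values), self.g)]

    def update(self, k, value):
        """Set values[k] and refresh the minimum of its group."""
        self.values[k] = value
        p = k // self.g
        self.root[p] = min(self.values[p * self.g:(p + 1) * self.g])

    def min_after(self, i):
        """Minimum of values[k] over all k > i (INF if there is none)."""
        start = i + 1
        if start >= len(self.values):
            return INF
        p = start // self.g
        # Part 1: the group that is only partly covered by the query.
        partial = min(self.values[start:(p + 1) * self.g])
        # Part 2: all groups lying entirely inside the query range.
        rest = min(self.root[p + 1:], default=INF)
        return min(partial, rest)
```

Each query touches one group and the root, so it inspects about $2\sqrt{m}$ entries for a vector of length m; in the text both inspections are replaced by single table lookups on packed words.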

Theorem 2. The described modifications to the buddy system satisfy the
minimum-splitting requirement, and cause Allocate [Deallocate] to run in
constant worst-case [amortized] time.



6 Simulation 

To help understand the effect of the excess block splitting in the first method 
(Section 4), we simulated it together with the standard buddy system for various 
memory sizes and “load factors.” Our simulation attempts to capture the spirit 
of the classic study by Knuth [13] that compares various dynamic allocation 
schemes. In each time unit of his and our simulations, a new block is allocated 
with randomly chosen size and lifetime according to various distributions, and 
old blocks are checked for lifetime expiry which causes deallocations. If there is 
ever insufficient memory to perform an allocation, the simulation halts. 

While Knuth periodically examines memory snapshots by hand to gain
qualitative insight, we compute various statistics to help quantify the difference
between the two buddy systems. To reduce the effect of the choice of random
numbers, we run the two buddy schemes on exactly the same input sequence,
repeatedly for various sequences. We also simulate an “optimal” memory
allocation scheme, which continually compacts all allocated blocks into the left fraction
of memory, in order to measure the “difficulty” of the input sequence.

Few experimental results seem to be available on typical block-size and
lifetime distributions, so any choice is unfortunately guesswork. Knuth’s block sizes
are either uniformly distributed, exponentially distributed, or distributed
according to a hand-selection of probabilities. We used the second distribution
(choosing size $2^i$ with probability $1/(1 + \lfloor \lg n \rfloor)$), and what we guessed to be a
generalization of the third distribution (choosing size $2^i$ with probability $2^{-i-1}$,
roughly). We dropped the first distribution because we believe it weights large
blocks too heavily: blocks are typically quite small. Note that because the two
main memory-allocation methods we simulate are buddy systems, we assume 
that all allocations ask for sizes that are powers of two. Also, to avoid rapid 
overflow in the second distribution, we only allow block sizes up to i.e., 

logarithms of block sizes up to | Ig n. 

Knuth’s lifetimes are uniformly distributed according to one of three ranges. 
We also use uniform distribution but choose our range based on a given param- 
eter called the load factor. The load factor L represents the fraction of memory 
that tends to be used by the system. Given one of the distributions above on 
block size, we can compute the expected block size E, and therefore compute a 
lifetime Ln/E that will on average keep the amount of memory used equal to 
Ln (where n is the size of memory). To randomize the situation, we choose a 
lifetime uniformly between 1 and 2Ln/E — 1, which has the same expected value 
Ln/E. 
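
As a quick check of the stated expected value (a worked step, assuming lifetimes are integers drawn uniformly from $\{1, \dots, T\}$ with $T = 2Ln/E - 1$; the symbols $T$ and $\ell$ are introduced here only for illustration):

```latex
\[
  \mathbb{E}[\ell] \;=\; \frac{1}{T}\sum_{t=1}^{T} t \;=\; \frac{T+1}{2}
  \;=\; \frac{(2Ln/E - 1) + 1}{2} \;=\; \frac{Ln}{E}.
\]
```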

The next issue is what to measure. To address this it is useful to define a 
notion of the system reaching an “equilibrium.” Because the simulation starts 
with an empty memory, it will start by mostly allocating blocks until it reaches 
the expected memory occupancy, Ln. Suppose it takes t time steps to reach that 
occupancy. After t more steps (a total of 2t), we say that the system has reached 
an equilibrium; at that point, it is likely to stay in a similar configuration. (Of
course, it is possible for the simulation to halt before reaching an equilibrium, 
in which case we discard that run.) 

One obvious candidate for a quantity to measure is the amount of fragmen- 
tation (i.e., the number of free blocks) for each method, once every method has 
reached an equilibrium. However, this is not really of interest to the user: the 
user wants to know whether her/his block can be allocated, or whether the sys- 
tem will fail by being unable to service the allocation. This suggests a more 
useful metric, the time to failure, frequently used in the area of fault tolerance. 

A related metric is to wait until all systems reach an equilibrium (ignoring 
the results if the system halts before that), and then measure the largest free 
allocatable block in each system. For the standard buddy system, this is the 
largest block of size a power of two; for our modified buddy system, it is the 
largest block in any superblock; and for the optimal system, it is simply the 
amount of free memory. This measures, at the more-or-less arbitrary time of all 
systems reaching equilibrium, the maximum-size block that could be allocated. 




[Figure 2: four panels, each plotting relative error: time to failure
(exponential dist.), equilibrium max. block size (exponential dist.), time to
failure (uniform dist.), and equilibrium max. block size (uniform dist.).]

Fig. 2. Simulation results. The distributions refer to the distributions of the
logarithms of the block sizes, and “load” refers to the load factor.



We feel that these two metrics capture some notion of what users of a 
memory-allocation system are interested in. By evaluating them for all three 
systems under the same inputs, we can measure the difference between the two 
buddy systems, relative to the optimal system. This kind of “relative error” was 
measured for 100 runs and then averaged, for each case. Memory size ranges 
between 2^ (the smallest power-of-two size for which a difference between the 
two buddy systems is noticeable) and 2^^ (the size used by BSD 4.4 UNIX [18]). 
The tested load factors are 50%, 75%, and 90%. 

The results are shown in Fig. 2. The relative errors are for the most part 
quite small (typically under 5%). Indeed, our first method occasionally does 
somewhat better than the standard buddy system, because its different choices 
of blocks to split cause some fortunate mergings. Further evidence is that, for 
the exponential distribution, less than 10% of the runs showed any difference in 
time-to-failure between the two systems. (However, the number of differences is 
greater for the uniform distribution.) 

Thus, the difference in distributions of fragmentation between the two buddy 
systems seems reasonably small. The simplicity of our first method may make it 
attractive for implementation. 






7 Conclusion 

We have presented two enhancements to the buddy system that improve the 
running time of Allocate to constant worst-case time, and Deallocate to constant 
amortized time. The more complex method keeps the distribution of fragmen- 
tation essentially the same as the standard method, while the simpler approach 
leads to a different and slightly worse distribution. It would be of interest to 
specify this difference mathematically. 

We note that it is crucial for Allocate to execute as quickly as possible (and 
in particular fast in the worst case), because the executing process cannot pro- 
ceed until the block allocation is complete. In contrast, it is reasonable for the 
Deallocate time bound to be amortized, because the result of the operation is 
not important and the actual work can be delayed until the CPU is idle (or the 
memory becomes full). Indeed, this delay idea has been used to improve the cost 
of the standard buddy system’s Deallocate [1,10,15,16]. On the other hand, for 
the purposes of theoretical results, it would be of interest to obtain a constant 
worst-case time bound for both Allocate and Deallocate. 

References 

1. R. E. Barkley and T. Paul Lee. A lazy buddy system bounded by two coalescing
delays per class. Operating Systems Review, pages 167-176, Dec. 1989.

2. Andrej Brodnik. Computation of the least significant set bit. In Proceedings of the
2nd Electrotechnical and Computer Science Conference, Portoroz, Slovenia, 1993.

3. Allan G. Bromley. An improved buddy method for dynamic storage allocation. In
Proceedings of the 7th Australian Computer Conference, pages 708-715, 1976.

4. Allan G. Bromley. Memory fragmentation in buddy methods for dynamic storage
allocation. Acta Informatica, 14:107-117, 1980.

5. Warren Burton. A buddy system variation for disk storage allocation.
Communications of the ACM, 19(7):416-417, July 1976.

6. Shyamal K. Chowdhury and Pradip K. Srimani. Worst case performance of
weighted buddy systems. Acta Informatica, 24(5):555-564, 1987.

7. Ben Cranston and Rick Thomas. A simplified recombination scheme for the
Fibonacci buddy system. Communications of the ACM, 18(6):331-332, June 1975.

8. James A. Hinds. Algorithm for locating adjacent storage blocks in the buddy
system. Communications of the ACM, 18(4):221-222, Apr. 1975.

9. Daniel S. Hirschberg. A class of dynamic memory allocation algorithms.
Communications of the ACM, 16(10):615-618, Oct. 1973.

10. Arie Kaufman. Tailored-list and recombination-delaying buddy systems. ACM
Transactions on Programming Languages and Systems, 6(1):118-125, Jan. 1984.

11. Kenneth C. Knowlton. A fast storage allocator. Communications of the ACM,
8(10):623-625, Oct. 1965.

12. Kenneth C. Knowlton. A programmer’s description of L6. Communications of the
ACM, 9(8):616-625, Aug. 1966.

13. Donald E. Knuth. Dynamic storage allocation. In The Art of Computer
Programming, volume 1, section 2.5, pages 435-455. Addison-Wesley, 1968.

14. Philip D. L. Koch. Disk file allocation based on the buddy system. ACM
Transactions on Computer Systems, 5(4):352-370, Nov. 1987.

15. T. Paul Lee and R. E. Barkley. Design and evaluation of a watermark-based lazy
buddy system. Performance Evaluation Review, 17(1):230, May 1989.

16. T. Paul Lee and R. E. Barkley. A watermark-based lazy buddy system for kernel
memory allocation. In Proceedings of the 1989 Summer USENIX Conference, pages
1-13, June 1989.

17. Errol L. Lloyd and Michael C. Loui. On the worst case performance of buddy
systems. Acta Informatica, 22(4):451-473, 1985.

18. Marshall Kirk McKusick, Keith Bostic, Michael J. Karels, and John S.
Quarterman. The Design and Implementation of the 4.4 BSD Operating System.
Addison-Wesley, 1996.

19. Ivor P. Page and Jeff Hagins. Improving the performance of buddy systems. IEEE
Transactions on Computers, C-35(5):441-447, May 1986.

20. James L. Peterson and Theodore A. Norman. Buddy systems. Communications
of the ACM, 20(6):421-431, June 1977.

21. Paul W. Purdom, Jr. and Stephen M. Stigler. Statistical properties of the buddy
system. Journal of the ACM, 17(4):683-697, Oct. 1970.

22. David L. Russell. Internal fragmentation in a class of buddy systems. SIAM
Journal on Computing, 6(4):607-621, Dec. 1977.

23. Kenneth K. Shen and James L. Peterson. A weighted buddy method for dynamic
storage allocation. Communications of the ACM, 17(10):558-562, Oct. 1974. See
also the corrigendum in 18(4):202, Apr. 1975.



Optimal Bounds for Transformations of
ω-Automata



Christof Löding



Lehrstuhl für Informatik VII,
RWTH Aachen, D-52056 Aachen
loeding@informatik.rwth-aachen.de



Abstract. In this paper we settle the complexity of some basic
constructions of ω-automata theory, concerning transformations of automata
characterizing the set of ω-regular languages. In particular we consider
Safra’s construction (for the conversion of nondeterministic Büchi
automata into deterministic Rabin automata) and the appearance record
constructions (for the transformation between different models of
deterministic automata with various acceptance conditions). Extending
results of Michel (1988) and Dziembowski, Jurdziński, and Walukiewicz
(1997), we obtain sharp lower bounds on the size of the constructed
automata.



1 Introduction 

The theory of ω-automata offers interesting transformation constructions,
allowing one to pass from nondeterministic to deterministic automata and from one
acceptance condition to another. The automaton models considered in this
paper are nondeterministic Büchi automata [1], and deterministic automata of
acceptance types Muller [12], Rabin [13], Streett [17], and parity [11]. There
are two fundamental constructions to achieve transformations between these
models. The first is based on the data structure of Safra trees [14] (for the
transformation from nondeterministic to deterministic automata), and the second
on the data structure of appearance records [2,3,5] (for the transformations
between deterministic Muller, Rabin, Streett, and parity automata). In this paper,
we show that for most of the transformations, these constructions are optimal,
sharpening previous results from the literature. This requires an analysis and
extension of examples as proposed by Michel [10] and Dziembowski, Jurdziński,
and Walukiewicz [4].

The first construction of deterministic Rabin automata from nondeterministic
Büchi automata is due to McNaughton [9]. Safra’s construction [14] generalizes
the classical subset construction by introducing trees of states (Safra trees)
instead of sets of states, yielding a complexity of $2^{O(n \log n)}$ (where n is the number
of states of the Büchi automaton). Using an example of Michel [10] one obtains
the optimality of Safra’s construction in the sense that there is no conversion
of nondeterministic Büchi automata with n states into deterministic Rabin
automata with $2^{O(n)}$ states and $O(n)$ pairs in the acceptance condition (see the



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 97—109, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






survey [18]). A drawback is the restriction to Rabin automata with $O(n)$ pairs. In
the present paper we eliminate this restriction. This is shown in Section 3. Also,
using results of Section 4, we obtain an optimal bound for the transformation of
nondeterministic Büchi automata into deterministic Streett automata.

For the transformations between deterministic models, the construction of
appearance records, introduced by Büchi [2,3] and Gurevich and Harrington [5] in
the context of infinite games, is useful. To transform Muller automata into other
deterministic automata one uses state appearance records (SAR). The main
component of an SAR is a permutation of states, representing the order of the last
visits of the states in a run. This leads to a size of $(O(n))!$ of the resulting
automaton, where n is the number of states of the original automaton.

For the transformation of Rabin or Streett automata into other deterministic
models one uses index appearance records (IAR). The main component of an
IAR is a permutation of the indices of the pairs (of state sets) in the acceptance
condition, representing the order of the last visits of the first components of
these pairs. This leads to a size of $(O(r))!$ of the resulting automaton, where r
is the number of pairs of the original automaton.

Dziembowski, Jurdziński, and Walukiewicz [4] have studied the state
appearance records as memory entries in automata which execute winning strategies in
infinite games. They presented an example of an infinite game over a graph with
2n vertices where each winning strategy requires a memory of size n!. Starting
from this example we introduce families of languages which yield optimal
lower bounds for all automata transformations which involve appearance record
constructions (either in the form of SAR or IAR).

Table 1 at the end of the paper lists the different transformations considered. 
In this paper we show that all these transformations involve a factorial blow up. 

The lower bounds as exposed in this paper show that in ω-automata theory
the single exponential lower bound $2^{\Omega(n)}$ as known from the subset construction
from the classical theory of automata on finite words has been extended
to $2^{\Omega(n \log n)}$. So we see in which sense it is necessary to pass from sets of states
(classical case) to sequences or even trees of states.

We leave open the question of an optimal lower bound for the transformation
of nondeterministic Büchi automata into deterministic Muller automata.

The present results are from the author’s diploma thesis [8]. Thanks to
Wolfgang Thomas for his advice in this research.



2 Notations and Definitions 



For an arbitrary set X we denote the set of infinite sequences (or infinite words)
over X by $X^\omega$ and the set of finite words over X by $X^*$. For a sequence $\sigma \in X^\omega$
and for an $i \in \mathbb{N}$ the element on the ith position in $\sigma$ is denoted by $\sigma(i)$, i.e.,
$\sigma = \sigma(0)\sigma(1)\sigma(2)\cdots$. The infix of $\sigma$ from position i to position j is denoted by
$\sigma[i,j]$. We define $In(\sigma)$, the infinity set of $\sigma$, to be the set of elements from X
that appear infinitely often in $\sigma$. The length of a word $w \in X^*$ is denoted by $|w|$.






An ω-automaton A is a tuple $(Q, \Sigma, q_0, \delta, Acc)$. The tuple $(Q, \Sigma, q_0, \delta)$ is
called the transition structure of A, where $Q \neq \emptyset$ is a finite set of states, $\Sigma$
is a finite alphabet, and $q_0$ is the initial state. The transition function $\delta$ is a
function $\delta : Q \times \Sigma \to Q$ for deterministic automata and $\delta : Q \times \Sigma \to 2^Q$ for
nondeterministic automata. The last component Acc is the acceptance condition.

A run of A on a word $\alpha \in \Sigma^\omega$ is an infinite state sequence $\sigma \in Q^\omega$ such that
$\sigma(0) = q_0$ and for all $i \in \mathbb{N}$ one has $\sigma(i+1) = \delta(\sigma(i), \alpha(i))$ for deterministic
automata, and $\sigma(i+1) \in \delta(\sigma(i), \alpha(i))$ for nondeterministic automata.

A run is called accepting iff it satisfies the acceptance condition. We will
specify this below for the different forms of acceptance conditions. The language
$L(A)$ that is accepted or recognized by the automaton is defined as
$L(A) = \{\alpha \in \Sigma^\omega \mid \text{there is an accepting run of } A \text{ on } \alpha\}$.

In this paper we consider acceptance conditions of type Büchi, Muller, Rabin,
Streett, and parity.

Let $A = (Q, \Sigma, q_0, \delta, Acc)$ be an ω-automaton. For the different acceptance
types mentioned above the acceptance condition Acc is given in different forms.

A Büchi condition [1] refers to a set $F \subseteq Q$, and a run $\sigma$ of a Büchi automaton
is defined to be accepting iff $In(\sigma) \cap F \neq \emptyset$.

A Muller condition [12] refers to a system of sets $\mathcal{F} \subseteq 2^Q$; a run $\sigma$ of a Muller
automaton is called accepting iff $In(\sigma) \in \mathcal{F}$.

A Rabin condition [13] refers to a list of pairs $\Omega = \{(E_1, F_1), \dots, (E_r, F_r)\}$
with $E_i, F_i \subseteq Q$ for $i = 1, \dots, r$. A run $\sigma$ of a Rabin automaton is called accepting
iff there exists an $i \in \{1, \dots, r\}$ such that $In(\sigma) \cap F_i \neq \emptyset$ and $In(\sigma) \cap E_i = \emptyset$.

A Streett condition [17] also refers to such a list $\Omega$ but it is used in the dual
way. A run $\sigma$ of a Streett automaton is called accepting iff for every $i \in \{1, \dots, r\}$
one has $In(\sigma) \cap F_i = \emptyset$ or $In(\sigma) \cap E_i \neq \emptyset$.

A parity condition [11] refers to a mapping $c : Q \to \{0, \dots, k\}$ with $k \in \mathbb{N}$.
A run $\sigma$ of a parity automaton is called accepting iff $\min\{c(q) \mid q \in In(\sigma)\}$ is
even. The numbers $0, \dots, k$ are called colors in this context.

Obviously the Muller condition is the most general form of acceptance condi- 
tion. This means every automaton A of the form above can be transformed into 
an equivalent Muller automaton just by collecting the sets of states that satisfy 
the acceptance condition of A. 

Let us note that Parity conditions can also be represented as Rabin and as 
Streett conditions: 

Proposition 1. Let $A = (Q, \Sigma, q_0, \delta, c)$ with $c : Q \to \{0, \dots, k\}$ be a parity
automaton and let $r = \lfloor k/2 \rfloor$. Let $\Omega = \{(E_0, F_0), \dots, (E_r, F_r)\}$ with
$E_i = \{q \in Q \mid c(q) < 2i\}$ and $F_i = \{q \in Q \mid c(q) = 2i\}$ for $i = 0, \dots, r$. Furthermore
let $\Omega' = \{(E'_0, F'_0), \dots, (E'_r, F'_r)\}$ with $E'_i = \{q \in Q \mid c(q) < 2i + 1\}$ and
$F'_i = \{q \in Q \mid c(q) = 2i + 1\}$ for $i = 0, \dots, r$. Then the Rabin automaton
$A_1 = (Q, \Sigma, q_0, \delta, \Omega)$ and the Streett automaton $A_2 = (Q, \Sigma, q_0, \delta, \Omega')$ are equivalent
to A.
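
The translation in Proposition 1 can be checked on small examples with the following sketch. Acceptance is evaluated directly on a given infinity set rather than on runs, and the dictionary-based representation of the coloring as well as all function names are assumptions of this illustration.

```python
def parity_accepts(color, inf_set):
    """Parity acceptance: the minimal color seen infinitely often is even."""
    return min(color[q] for q in inf_set) % 2 == 0


def rabin_pairs(color, k):
    """Pairs (E_i, F_i) with E_i = {c < 2i}, F_i = {c = 2i}, i = 0..floor(k/2)."""
    return [({q for q in color if color[q] < 2 * i},
             {q for q in color if color[q] == 2 * i})
            for i in range(k // 2 + 1)]


def streett_pairs(color, k):
    """Pairs (E'_i, F'_i) with E'_i = {c < 2i+1}, F'_i = {c = 2i+1}."""
    return [({q for q in color if color[q] < 2 * i + 1},
             {q for q in color if color[q] == 2 * i + 1})
            for i in range(k // 2 + 1)]


def rabin_accepts(pairs, inf_set):
    """Some pair has F_i visited infinitely often and E_i only finitely often."""
    return any(bool(inf_set & F) and not (inf_set & E) for E, F in pairs)


def streett_accepts(pairs, inf_set):
    """Every pair has F_i visited finitely often or E_i infinitely often."""
    return all(not (inf_set & F) or bool(inf_set & E) for E, F in pairs)
```

For any nonempty infinity set, the three acceptance tests should agree when the pairs are built from the same coloring.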

For a deterministic automaton A with a Muller, Rabin, Streett, or parity
condition we give a deterministic automaton recognizing the complementary
language. This automaton is called the dual of A. For a Muller or a parity
automaton the dual automaton is of the same type. For a Rabin automaton the
dual automaton is a Streett automaton and vice versa.

Proposition 2. (i) Let $A = (Q, \Sigma, q_0, \delta, \mathcal{F})$ be a deterministic Muller
automaton. The deterministic Muller automaton $A' = (Q, \Sigma, q_0, \delta, 2^Q \setminus \mathcal{F})$ recognizes
$\Sigma^\omega \setminus L(A)$.

(ii) Let $A = (Q, \Sigma, q_0, \delta, \Omega)$ be a deterministic Rabin (Streett) automaton.
The deterministic Streett (Rabin) automaton $A' = (Q, \Sigma, q_0, \delta, \Omega)$ recognizes
$\Sigma^\omega \setminus L(A)$.

(iii) Let $A = (Q, \Sigma, q_0, \delta, c)$ be a deterministic parity automaton. The
deterministic parity automaton $A' = (Q, \Sigma, q_0, \delta, c')$ with $c'(q) = c(q) + 1$ for every
$q \in Q$ recognizes $\Sigma^\omega \setminus L(A)$.

Because of the special structure of Rabin conditions on the one hand and
Streett conditions on the other hand, we can state the following about the
union of infinity sets.

Proposition 3. (i) Let $A = (Q, \Sigma, q_0, \delta, \Omega)$ be a Streett automaton and let
$R, S \subseteq Q$ be two infinity sets of possible runs satisfying the acceptance condition
of A. Then a run with infinity set $R \cup S$ also satisfies the acceptance condition
of A.

(ii) Let $A = (Q, \Sigma, q_0, \delta, \Omega)$ be a Rabin automaton and let $R, S \subseteq Q$ be two
infinity sets of possible runs not satisfying the acceptance condition of A. Then
a run with infinity set $R \cup S$ also does not satisfy the acceptance condition of A.

3 Optimality of Safra’s Construction 

In this section we show the optimality of Safra’s construction ([14]) which
transforms a nondeterministic Büchi automaton with n states into a deterministic
Rabin automaton with $2^{O(n \log n)}$ states. The main part of the proof consists of
Lemma 5, which states that there exists a family $(L_n)_{n \ge 2}$ of languages, such
that $L_n$ can be recognized by a nondeterministic Büchi automaton with $O(n)$
states, but the complement of $L_n$ cannot be recognized by a nondeterministic
Streett automaton with less than n! states.

This lemma is essentially due to Michel, who proved that there is a family
$(L_n)_{n \ge 2}$ of languages, such that $L_n$ can be recognized by a nondeterministic
Büchi automaton with $O(n)$ states, but the complement of $L_n$ cannot be
recognized by a nondeterministic Büchi automaton with less than n! states. Here
we use the same family of languages as Michel but show the stronger result
that there is no nondeterministic Streett automaton with less than n! states
recognizing the complement of $L_n$.

We define the languages $L_n$ via Büchi automata $A_n$ over the alphabet
$\Sigma_n = \{1, \dots, n, \#\}$. Later we adapt the idea for languages over a constant alphabet.
For a technical reason we use a set of initial states instead of one initial state,
but recall that we can reduce the automata to the usual format by adding one
extra state.

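Define the automaton $A_n = (Q_n, \Sigma_n, Q_0^n, \delta_n, F_n)$ as follows.

- $Q_n = \{q_0, q_1, \dots, q_n\}$, $Q_0^n = \{q_1, \dots, q_n\}$ and $F_n = \{q_0\}$.

- The transition function $\delta_n$ is defined by

$\delta_n(q_0, a) = \{q_a\}$ for $a \in \{1, \dots, n\}$,

$\delta_n(q_0, \#) = \emptyset$,

$\delta_n(q_i, a) = \{q_i\}$ for $a \in \Sigma_n$, $i \in \{1, \dots, n\}$, $a \neq i$,

$\delta_n(q_i, i) = \{q_i, q_0\}$ for $i \in \{1, \dots, n\}$.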

The automaton is shown in Figure 1. The idea can be adjusted to automata
with the constant alphabet $\{a, b, \#\}$ by coding $i \in \{1, \ldots, n-1\}$ with $a^i b$,
$n$ with $a^n a^* b$, and $\#$ with $\#$. The resulting automaton is shown on the right hand
side of Figure 1 and still has $O(n)$ states.
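To make the definition concrete, the following minimal sketch (not part of the paper) simulates $A_n$ over the alphabet $\{1, \ldots, n, \#\}$ on ultimately periodic words $u v^\omega$. The names `delta` and `accepts`, the encoding of state $q_i$ as the integer `i`, and the choice of $\{q_1, \ldots, q_n\}$ as the set of initial states are assumptions made for illustration.

```python
def delta(q, a):
    """Transition function of A_n; state 0 stands for q_0, state i for q_i."""
    if q == 0:
        return set() if a == "#" else {a}   # q_0 moves to q_a and blocks on '#'
    if a == q:
        return {q, 0}                       # q_i may stay or move to q_0 on letter i
    return {q}                              # q_i ignores every other letter


def accepts(n, u, v):
    """Decide whether A_n accepts the word u v^omega (v must be nonempty)."""
    current = set(range(1, n + 1))          # assumed initial states q_1..q_n
    for a in u:
        current = {p for q in current for p in delta(q, a)}

    def successors(node):
        q, k = node                         # automaton state q, about to read v[k]
        return {(p, (k + 1) % len(v)) for p in delta(q, v[k])}

    def reachable(sources):
        seen, stack = set(), list(sources)
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(successors(x))
        return seen

    nodes = reachable({(q, 0) for q in current})
    # Buchi acceptance: some reachable node with state q_0 lies on a cycle.
    return any(q == 0 and (q, k) in reachable(successors((q, k)))
               for (q, k) in nodes)
```

For instance, `accepts(2, [], [1, 2])` holds, whereas `accepts(2, [], [1, 2, "#"])` does not, because every visit to $q_0$ on that word is eventually followed by `#`.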




Fig. 1. The transition structure of the Büchi automaton $A_n$: on the left hand side
over the alphabet $\{1, \ldots, n, \#\}$ and on the right hand side over the alphabet $\{a, b, \#\}$.
A nondeterministic Streett automaton for the complementary language needs at
least $n!$ states.



As an abbreviation we define $L_n = L(A_n)$. Before we prove the main lemma,
we first give a characterization of the languages $L_n$ which is not difficult to
prove.

Lemma 4. Let $n \in \mathbb{N}$ and $\alpha \in \Sigma_n^\omega$. Then the following two statements are
equivalent.

(i) $\alpha \in L_n$.

(ii) There exist $i_1, \ldots, i_k \in \{1, \ldots, n\}$ such that each pair $i_1 i_2, \ldots, i_{k-1} i_k, i_k i_1$
appears infinitely often in $\alpha$.

(Our definition corrects an inaccuracy of [18], where the automata are defined
in the same way but with $q_0$ as the only initial state. For the languages defined in
this way condition (ii) has to be sharpened.)
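For ultimately periodic words $u v^\omega$, condition (ii) of Lemma 4 is a cycle test on a finite graph: the pairs that appear infinitely often are exactly the cyclically adjacent letters of $v$. The sketch below illustrates this; the function names and the encoding of letters as integers and the string `"#"` are assumptions.

```python
def infinite_pairs(v):
    """Pairs of adjacent letters that occur infinitely often in u v^omega."""
    return {(v[k], v[(k + 1) % len(v)]) for k in range(len(v))}


def satisfies_lemma4(n, v):
    """Condition (ii): the infinitely occurring pairs over {1..n} contain a cycle."""
    edges = {(a, b) for (a, b) in infinite_pairs(v) if a != "#" and b != "#"}
    # Repeatedly discard letters without a successor among the remaining
    # letters; a directed cycle exists iff some letter survives.
    alive = set(range(1, n + 1))
    while True:
        dead = {a for a in alive if not any((a, b) in edges for b in alive)}
        if not dead:
            return bool(alive)
        alive -= dead
```

The finite prefix $u$ plays no role here, which matches the fact that membership in $L_n$ depends only on what happens infinitely often.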






Lemma 5. Let $n \geq 2$. The complement $\Sigma_n^\omega \setminus L_n$ of $L_n$ cannot be recognized by
a nondeterministic Streett automaton with fewer than $n!$ states.

Proof. Let $n \in \mathbb{N}$ and let $A' = (Q', \Sigma_n, q_0', \delta', \Omega)$ be a Streett automaton with
$L' := L(A') = \Sigma_n^\omega \setminus L_n$, and let $(i_1 \ldots i_n)$, $(j_1 \ldots j_n)$ be different permutations
of $(1 \ldots n)$. Define $\alpha = (i_1 \ldots i_n \#)^\omega$ and $\beta = (j_1 \ldots j_n \#)^\omega$. From Lemma 4 we
obtain $\alpha, \beta \in L'$. Thus we have two successful runs $r_\alpha, r_\beta \in Q'^\omega$ of $A'$ on $\alpha, \beta$.
Define $R = In(r_\alpha)$ and $S = In(r_\beta)$. If we can show $R \cap S = \emptyset$, then we are done
because there are $n!$ permutations of $(1 \ldots n)$.

Assume $R \cap S \neq \emptyset$. Under this assumption we can construct $\gamma \in \Sigma_n^\omega$ with
$\gamma \in L_n \cap L'$. This is a contradiction, since $L'$ is the complement of $L_n$.

Let $q \in R \cap S$. Then we can choose a prefix $u$ of $\alpha$ leading from $q_0'$ to $q$, an
infix $v$ of $\alpha$ containing the word $i_1 \ldots i_n$, leading from $q$ to $q$ through all states
of $R$, and an infix $w$ of $\beta$ containing the word $j_1 \ldots j_n$, leading from $q$ to $q$
through all states of $S$.

Now let $\gamma = u(vw)^\omega$. A run $r_\gamma$ of $A'$ on $\gamma$ first moves from $q_0'$ to $q$ (while
reading $u$), and then cycles alternatingly through $R$ (while reading $v$) and $S$
(while reading $w$). Therefore $r_\gamma$ has the infinity set $R \cup S$. Because $R$ and $S$
satisfy the Streett condition, $R \cup S$ also does (Proposition 3) and we have $\gamma \in L'$.

To show $\gamma \in L_n$ we first note that, if $k$ is the lowest index with $i_k \neq j_k$, then
there exist $l, m > k$ with $j_k = i_l$ and $i_k = j_m$. By the choice of the words $v$ and $w$
one can see that $\gamma$ contains infinitely often the segments $i_1 \ldots i_n$ and $j_1 \ldots j_n$.
Thus $\gamma$ also contains infinitely often the segments $i_k i_{k+1}, \ldots, i_{l-1} i_l, j_k j_{k+1}, \ldots, j_{m-1} j_m$.
Since $i_l = j_k$ and $j_m = i_k$, these pairs form a cycle, and using Lemma 4 we can
conclude $\gamma \in L_n$.
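The combinatorial core of this proof can be checked mechanically for small $n$. The sketch below is a simplification under stated assumptions: it takes $v = i_1 \ldots i_n \#$ and $w = j_1 \ldots j_n \#$ instead of arbitrary infixes, and uses the cycle criterion of Lemma 4 to confirm that each word $(i_1 \ldots i_n \#)^\omega$ lies outside $L_n$ while every mixture of two different permutations lies inside. All function names are illustrative.

```python
from itertools import permutations


def has_pair_cycle(v):
    """Lemma 4 (ii) for v^omega: do adjacent pairs of letters other than '#'
    contain a directed cycle?"""
    edges = {(v[k], v[(k + 1) % len(v)]) for k in range(len(v))}
    alive = {a for a in v if a != "#"}
    while True:
        dead = {a for a in alive if not any((a, b) in edges for b in alive)}
        if not dead:
            return bool(alive)
        alive -= dead


def lemma5_core_holds(n):
    """Check the two facts used in the proof for all permutations of 1..n."""
    perms = [list(p) for p in permutations(range(1, n + 1))]
    # alpha = (i_1 ... i_n #)^omega is not in L_n.
    singles_rejected = all(not has_pair_cycle(p + ["#"]) for p in perms)
    # gamma with loop v w is in L_n whenever the permutations differ.
    mixes_accepted = all(has_pair_cycle(p + ["#"] + r + ["#"])
                         for p in perms for r in perms if p != r)
    return singles_rejected and mixes_accepted
```

Calling `lemma5_core_holds(4)` examines all 24 permutations and all 552 ordered pairs of distinct permutations.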



Theorem 6. There exists a family $(L_n)_{n \geq 2}$ of languages such that for every
$n$ the language $L_n$ can be recognized by a nondeterministic Büchi automaton
with $O(n)$ states but cannot be recognized by a deterministic Rabin automaton
with fewer than $n!$ states.

Proof. Consider the family of languages from Lemma 5. Let $n \in \mathbb{N}$. Assume there
exists a deterministic Rabin automaton with fewer than $n!$ states recognizing $L_n$.
The dual of this automaton is a deterministic Streett automaton with fewer than
$n!$ states recognizing $\Sigma_n^\omega \setminus L_n$. This contradicts Lemma 5.

Since parity conditions are special cases of Rabin conditions, Theorem 6 also
holds for parity automata instead of Rabin automata.

The theorem sharpens previous results from the literature (see the survey [18]),
where it is shown that there is no conversion of Büchi automata with $O(n)$
states into deterministic Rabin automata with $2^{O(n)}$ states and $O(n)$ pairs. We
point out that our proof is almost the same as in [18]. Only a few changes were
needed to get this slightly stronger result.

The example demonstrates the optimality of Safra's construction for the
transformation of nondeterministic Büchi automata into deterministic Rabin
automata. For the transformation of nondeterministic Büchi automata into
deterministic Muller automata this question is open. The known lower bound for
this transformation is $2^{\Omega(n)}$ ([16]). In the following we will see that the example
from above cannot be used to show an optimal lower bound for Muller automata:
we construct Muller automata $M_n$, $n \in \mathbb{N}$, with $O(n^2)$ states recognizing the
language $L_n$.

For $n \in \mathbb{N}$ define the Muller automaton $M_n = (Q'_n, \Sigma_n, q'_0, \delta'_n, F_n)$ by
$Q'_n = \Sigma_n \times \Sigma_n$, $q'_0 = (\#, \#)$, $\delta'_n((i, j), a) = (j, a)$, and $F \in F_n$ iff there exist
$i_1, \ldots, i_k \in \{1, \ldots, n\}$ such that $(i_1, i_2), \ldots, (i_{k-1}, i_k), (i_k, i_1) \in F$.

The automaton just collects all pairs of letters occurring in the input word
and then decides, using the Muller acceptance condition, if the property from
Lemma 4 is satisfied. Thus we have $L(M_n) = L_n$.

In the Muller automata $M_n$ every superset of an accepting set is accepting
too. Therefore the languages considered may even be recognized by
deterministic Büchi automata ([7]). Thus, we can restrict the domain of the regular
$\omega$-languages and get a sharpened version of Theorem 6: the factorial blow-up
in the transformation of Büchi automata into deterministic Rabin automata
already occurs over the class $G_\delta$ of those languages which are acceptable by a
deterministic Büchi automaton.

One may also ask for a lower bound for the transformation of
nondeterministic Büchi automata into deterministic Streett automata. The result we
obtain belongs to this section and therefore we mention it here: there exists a
family $(L_n)_{n \geq 2}$ of languages ($L_n$ over an alphabet of size $n$) such that for
every $n$ the language $L_n$ can be recognized by a nondeterministic Büchi automaton
with $O(n)$ states but cannot be recognized by a deterministic Streett automaton
with fewer than $n!$ states.

The proof uses results from the next section and thus the claim will be 
restated there (Theorem 8). 



4 Optimality of the Appearance Record Constructions 

Appearance records [2,3,5], abbreviated AR, serve to transform Muller, Rabin,
and Streett automata into parity automata. For these constructions two
different forms of ARs are used, namely state appearance records (SAR) and index
appearance records (IAR).

The SAR construction (see e.g. [18]) serves to transform a deterministic Muller
automaton with $n$ states into an equivalent deterministic parity automaton with
$(O(n))!$ states and $O(n)$ colors. Since parity automata are special kinds of Rabin
and Streett automata (Proposition 1), this construction also transforms Muller
automata into Rabin or Streett automata.

The IAR construction (see e.g. [15]) transforms a deterministic Streett
automaton with $n$ states and $r$ pairs into an equivalent parity automaton with
$n \cdot (O(r))!$ states and $O(r)$ colors. Because of the duality of Rabin and Streett
conditions and the self-duality of parity conditions (Proposition 2), the IAR
construction can be used for all nontrivial transformations between Rabin, Streett,
and parity automata.
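The paper does not spell out the record update. As an illustration only, the following sketch uses the common textbook formulation of a state appearance record: a permutation of the states ordered by most recent visit, together with the position (the "hit") from which the current state was moved to the front. The names `sar_step` and `infinity_set_via_sar` are hypothetical.

```python
def sar_step(record, state):
    """Move `state` to the front of the record; return the new record and
    the position (hit) at which `state` was found."""
    hit = record.index(state)
    return (state,) + record[:hit] + record[hit + 1:], hit


def infinity_set_via_sar(states, prefix, loop):
    """Recover the infinity set of the run prefix . loop^omega from the SAR."""
    record = tuple(states)
    for q in list(prefix) + list(loop):     # one pass over the loop settles the record
        record, _ = sar_step(record, q)
    max_hit = 0
    for q in loop:                          # observe one further pass
        record, hit = sar_step(record, q)
        max_hit = max(max_hit, hit)
    # The states up to the largest recurring hit are those visited infinitely often.
    return set(record[:max_hit + 1])
```

In this formulation a parity color would be derived from the hit position and from whether the record prefix up to the hit is an accepting set. A record over $n$ states is a permutation plus a hit, which is the source of the factorial in the $(O(n))!$ state bound.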






In this section we will show that all the AR constructions are of optimal
complexity. The idea for the proof originates in [4], where the optimality of the
SAR as memory for winning strategies in Muller games was shown. Just to avoid
confusion we would like to point out that the family of automata we use in our
proof is not just an adaptation of the games from [4]. The winning condition of the
games and the acceptance condition of the automata are not related. Our proof
also does not generalize the one from [4], because the family of automata used
here cannot be adapted to games requiring memory $n!$.

We first give a theorem showing the optimality of the IAR construction for
the transformation of Streett into Rabin automata and then explain how to
apply the theorem to get the optimality of all other AR transformations.

Theorem 7. There exists a family $(L_n)_{n \geq 2}$ of languages such that for every $n$
the language $L_n$ can be recognized by a deterministic Streett automaton with $O(n)$
states and $O(n)$ pairs but cannot be recognized by a deterministic Rabin
automaton with fewer than $n!$ states.

Proof. We define the languages $L_n$ via deterministic Streett automata $A_n$ over
the alphabet $\Sigma_n = \{1, \ldots, n\}$. Later we will explain how we can adapt the proof for
an alphabet of constant size. The transition structure of $A_n$ is shown
schematically in Figure 2. Formally, for $n \geq 2$, we define the Streett automaton
$A_n = (Q_n, \Sigma_n, q_0^n, \delta_n, \Omega_n)$ as follows.

- $Q_n = \{-n, \ldots, -1, 1, \ldots, n\}$ and $q_0^n = -1$.

- For $i, j \in \{1, \ldots, n\}$ let $\delta_n(i, j) = -j$ and $\delta_n(-i, j) = j$.

- $\Omega_n = \{(E_1, F_1), \ldots, (E_n, F_n)\}$ with $E_i = \{i\}$ and $F_i = \{-i\}$.



Fig. 2. The transition structure of the Streett automaton $A_n$: on the left hand side
over the alphabet $\{1, \ldots, n\}$ and on the right hand side over the alphabet $\{a, b\}$. An
equivalent deterministic Rabin automaton needs at least $n!$ states.



To characterize the words in $L_n$ we use the following notation. For a word
$\alpha \in \Sigma_n^\omega$ let $even(\alpha)$ be the set containing the letters that infinitely often
occur on an even position in $\alpha$ and let $odd(\alpha)$ be the set containing the letters
that infinitely often occur on an odd position in $\alpha$. This means $even(\alpha) =
In(\alpha(0)\alpha(2)\alpha(4) \cdots)$ and $odd(\alpha) = In(\alpha(1)\alpha(3)\alpha(5) \cdots)$. From the definition
of $A_n$ it follows that a word $\alpha \in \Sigma_n^\omega$ is in $L_n$ if and only if $odd(\alpha) \subseteq even(\alpha)$. As a
consequence of this, for $\alpha \in \Sigma_n^\omega$ and $u \in \Sigma_n^*$ with $|u|$ even, the word $u\alpha$ is in $L_n$
if and only if $\alpha$ is in $L_n$. Therefore, in a deterministic automaton recognizing $L_n$,
every state that can be reached by reading a prefix of even length can be used
as initial state without changing the accepted language.
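The characterization $odd(\alpha) \subseteq even(\alpha)$ can be cross-checked against a direct simulation of the Streett automaton on ultimately periodic words. The sketch below assumes the reading $\delta_n(i, j) = -j$, $\delta_n(-i, j) = j$, initial state $-1$, and pairs $E_i = \{i\}$, $F_i = \{-i\}$, with the Streett condition "if $F_i$ is visited infinitely often, then so is $E_i$". Function names are illustrative.

```python
from itertools import product


def _even_loop(v):
    """Unroll an odd-length loop twice so that position parities repeat."""
    return list(v) if len(v) % 2 == 0 else list(v) * 2


def streett_accepts(n, u, v):
    """Run the deterministic Streett automaton A_n on u v^omega."""
    def step(q, a):
        return -a if q > 0 else a           # positive states follow even positions

    loop = _even_loop(v)
    state = -1
    for a in list(u) + loop:                # prefix plus one settling pass
        state = step(state, a)
    inf = set()
    for a in loop:                          # states seen in every further pass
        state = step(state, a)
        inf.add(state)
    return all((-i not in inf) or (i in inf) for i in range(1, n + 1))


def odd_subset_even(u, v):
    """Direct test of odd(alpha) being a subset of even(alpha) for alpha = u v^omega."""
    loop = _even_loop(v)
    offset = len(u) % 2
    even = {a for k, a in enumerate(loop) if (k + offset) % 2 == 0}
    odd = {a for k, a in enumerate(loop) if (k + offset) % 2 == 1}
    return odd <= even


def cross_check(n, max_len):
    """Compare both tests on all short prefixes and loops over {1..n}."""
    letters = range(1, n + 1)
    return all(streett_accepts(n, u, v) == odd_subset_even(u, v)
               for lu in range(max_len + 1)
               for u in product(letters, repeat=lu)
               for lv in range(1, max_len + 1)
               for v in product(letters, repeat=lv))
```

The automaton needs only the sign of the current state to track parity, which is why $2n$ states and $n$ pairs suffice.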

We will prove by induction that every deterministic Rabin automaton recog- 
nizing Ln needs at least n! states. 

We show the base case of the induction for $n = 2$. An automaton recognizing
a nonempty proper subset of $\Sigma_2^\omega$ needs at least 2 states. Therefore the base case
of the induction holds.

Now let $n > 2$ and let $B = (Q, \Sigma_n, q_0, \delta, \Omega)$ be a deterministic Rabin
automaton with $L(B) = L_n$. Let $Q_{even}$ be the set of states that can be reached from $q_0$
by reading a prefix of even length.

For every $i \in \{1, \ldots, n\}$ and every $q \in Q_{even}$ we construct a deterministic
Rabin automaton $B_i^q$ over $\Sigma_n \setminus \{i\}$ by removing all $i$-transitions from $B$.
Furthermore $q$ is the initial state of $B_i^q$. Since $q$ can be reached in $B$ after having read
a prefix of even length, the language recognized by $B_i^q$ is $L_{n-1}$ (if $i \neq n$ then
the names of the letters are different but the language essentially equals $L_{n-1}$).
Thus, by the induction hypothesis, $B_i^q$ has at least $(n-1)!$ states.

We can strengthen this statement as follows. In every $B_i^q$ ($i \in \{1, \ldots, n\}$
and $q \in Q_{even}$) there is a strongly connected component with at least $(n-1)!$ states.
Just take a strongly connected component $S$ in $B_i^q$ such that there is no other
strongly connected component reachable from $S$ in $B_i^q$. Let $p \in S$ be a state
that is reachable from $q$ in $B_i^q$ by reading a prefix of even length. As we have
seen above, we can use $p$ as initial state in $B_i^q$ without changing the accepted
language. Therefore, by the induction hypothesis, $S$ must contain at least $(n-1)!$
states.

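Now, for $i \in \{1, \ldots, n\}$, we construct words $\alpha_i \in \Sigma_n^\omega$ with runs $\sigma_i$ of $B$ such
that $|In(\sigma_i)| \geq (n-1)!$ and $In(\sigma_i) \cap In(\sigma_j) = \emptyset$ for $i \neq j$. Then we are done
because $|Q| \geq \sum_{i=1}^{n} |In(\sigma_i)| \geq n \cdot (n-1)! = n!$.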

For $i \in \{1, \ldots, n\}$ construct the word $\alpha_i$ as follows. First take a $u_0 \in (\Sigma_n \setminus
\{i\})^*$ such that $u_0$ has even length and contains every letter from $\Sigma_n \setminus \{i\}$ on
an even and on an odd position. Furthermore $B_i^{q_0}$ should visit at least $(n-1)!$
states while reading $u_0$. This is possible since $B_i^{q_0}$ contains a strongly connected
component with at least $(n-1)!$ states. Let $q_1$ be the state reached by $B$ after having
read the word $u_0 k i$, where $k \in \{1, \ldots, n\} \setminus \{i\}$, so that $i$ stands on an odd
position. Then we choose a word $u_1 \in (\Sigma_n \setminus \{i\})^*$ with the same properties as $u_0$,
using $B_i^{q_1}$ instead of $B_i^{q_0}$. This means $u_1$ has even length, contains every letter
from $\Sigma_n \setminus \{i\}$ on an even and on an odd position, and $B_i^{q_1}$ visits at least
$(n-1)!$ states while reading $u_1$.

Repeating this procedure we get a word $\alpha_i = u_0 k i \, u_1 k i \, u_2 k i \cdots$ with $even(\alpha_i)
= \{1, \ldots, n\} \setminus \{i\}$, $odd(\alpha_i) = \{1, \ldots, n\}$, and therefore $\alpha_i \notin L_n$. For the run
$\sigma_i$ of $B$ on $\alpha_i$ we have $|In(\sigma_i)| \geq (n-1)!$. Hence it remains to show
$In(\sigma_i) \cap In(\sigma_j) = \emptyset$ for $i \neq j$.







Assume by contradiction that there exist $i \neq j$ with $In(\sigma_i) \cap In(\sigma_j) \neq \emptyset$. Then
we can construct a word $\alpha$ with run $\sigma$ such that $even(\alpha) = even(\alpha_i) \cup even(\alpha_j) =
\{1, \ldots, n\}$, $odd(\alpha) = odd(\alpha_i) \cup odd(\alpha_j) = \{1, \ldots, n\}$ and $In(\sigma) = In(\sigma_i) \cup In(\sigma_j)$,
by cycling alternatingly through the infinity sets of $\sigma_i$ and $\sigma_j$ (as in the proof of
Lemma 5). This is a contradiction since in Rabin automata the union of rejecting
cycles is rejecting (Proposition 3), but $\alpha$ is in $L_n$.

To adapt the proof for an alphabet of constant size we can code every letter
$i \in \{1, \ldots, n-1\}$ with $a^i b$ and $n$ with $a^n a^* b$. The resulting automaton is
shown on the right hand side of Figure 2 and still has $O(n)$ states.

Theorem 7 shows the optimality of the IAR construction for the
transformation of deterministic Streett automata into deterministic Rabin automata. The
duality of these two types of automata (Prop. 2) gives us an analogous theorem,
with the roles of Rabin automata and Streett automata exchanged. Parity
automata are special cases of Rabin automata and of Streett automata. Therefore
we also get analogous theorems for the transformation of Rabin automata into
parity automata and for the transformation of Streett automata into parity
automata. Furthermore, the property of the example automata to have $O(n)$ states
and $O(n)$ pairs also gives us analogous theorems when starting with Muller
automata instead of Rabin or Streett automata. Thus, Theorem 7 shows the
optimality of all AR constructions listed in Table 1.

A different construction for the conversion between Rabin and Streett
automata is given in [6]. It converts a deterministic Streett automaton with $n$
states and $r$ pairs into a deterministic Rabin automaton with $O(n \cdot r^k)$ states
and $l$ pairs, where $k$ is the Streett index of the language and $l$ is the Rabin index
of the language. The Rabin (Streett) index of a language is the number of pairs
needed in the acceptance condition to describe the language with a deterministic
Rabin (Streett) automaton. The languages from the family $(L_n)_{n \geq 2}$ have Rabin
and Streett index $O(n)$ and therefore the complexity of the construction is of
order $n^{O(n)}$ for our example automata. Hence, as a result of this section, the
transformation from [6] is also optimal.

At the end of Section 3 we stated a lower bound for the transformation of
nondeterministic Büchi automata into deterministic Streett automata. For that
aim we show that the languages $(\Sigma_n^\omega \setminus L_n)_{n \geq 2}$ of the present section can be
recognized by Büchi automata with $O(n)$ states. Then we are done because
every deterministic Streett automaton recognizing $\Sigma_n^\omega \setminus L_n$ needs at least $n!$
states (Theorem 7 and Prop. 2).

Theorem 8. There exists a family $(L_n)_{n \geq 2}$ of languages ($L_n$ over an alphabet
of $n$ letters) such that for every $n$ the language $L_n$ can be recognized by a
nondeterministic Büchi automaton with $O(n)$ states but cannot be recognized by a
deterministic Streett automaton with fewer than $n!$ states.

Proof. As mentioned above it suffices to show that there is a family $(B_n)_{n \geq 2}$ of
Büchi automata such that $B_n$ has $O(n)$ states and recognizes $\Sigma_n^\omega \setminus L_n$. From
the characterization of $L_n$ in the proof of Theorem 7 we know that $\alpha \in L_n$ iff
$odd(\alpha) \subseteq even(\alpha)$ and therefore $\alpha \notin L_n$ iff there exists an $i \in \{1, \ldots, n\}$ with
$i \in odd(\alpha)$ and $i \notin even(\alpha)$. Intuitively the Büchi automaton guesses the $i$ and
then verifies that it appears infinitely often on an odd position and from some
point on never on an even position. Formally $B_n = (Q_n, \Sigma_n, q_0, \delta_n, F_n)$ is defined
as follows.



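- $Q_n = \{q_o, q_e, q_o^1, q_e^1, q_f^1, \ldots, q_o^n, q_e^n, q_f^n\}$, $q_0 = q_o$, and $F_n = \{q_f^1, \ldots, q_f^n\}$.

- For $i \in \Sigma_n$ and $j \in \{1, \ldots, n\}$ let

  $\delta_n(q_o, i) = \{q_e\}$,  $\delta_n(q_e, i) = \{q_o, q_o^1, \ldots, q_o^n\}$,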



The automaton is built in such a way that it is in one of the states from
$\{q_e, q_e^1, \ldots, q_e^n\}$ iff the last letter was on an even position.

Let $\alpha \in \Sigma_n^\omega \setminus L_n$ and let $j \in odd(\alpha) \setminus even(\alpha)$. A successful run of $B_n$ stays in
the states $q_o$ and $q_e$ up to the point where $j$ does not appear on an even position
anymore. Then it moves to $q_o^j$. Whenever a $j$ appears on an odd position in $\alpha$,
the automaton is in $q_e^j$ and then moves to $q_f^j$. Since $j$ no longer appears on
an even position, the automaton can continue to infinity and accepts
$\alpha$ because it visits $q_f^j$ infinitely often. Therefore we have $\Sigma_n^\omega \setminus L_n \subseteq L(B_n)$.
Now let $\alpha \in L(B_n)$. There exists a $j$ such that in an accepting run $B_n$ from
some point on only visits states from $\{q_o^j, q_e^j, q_f^j\}$ and infinitely often visits $q_f^j$.
If the last read letter was on an odd position, then $B_n$ is in $q_o^j$ or in $q_f^j$ and
therefore $j$ may only appear on an even position before $B_n$ moves to the states
$\{q_o^j, q_e^j, q_f^j\}$. But since $B_n$ infinitely often visits $q_f^j$, there must be a $j$ on an odd
position infinitely often and therefore we have $L(B_n) \subseteq \Sigma_n^\omega \setminus L_n$.
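The behaviour described in this proof can be tested on ultimately periodic words. In the sketch below, the transitions out of the guessed states are an assumption reconstructed from the informal description, not a quotation of the paper: states are encoded as pairs `(kind, j)` with `kind` in `"o"`, `"e"`, `"f"` and `j = 0` for the unguessed states $q_o$, $q_e$. The result is compared with the condition that $odd(\alpha) \subseteq even(\alpha)$ fails.

```python
from itertools import product


def delta_b(n, state, a):
    """Assumed transition function of B_n (3n + 2 states)."""
    kind, j = state
    if j == 0:                                   # no letter guessed yet
        if kind == "o":
            return {("e", 0)}
        return {("o", 0)} | {("o", i) for i in range(1, n + 1)}
    if kind in ("o", "f"):                       # next letter is on an even position
        return set() if a == j else {("e", j)}
    return {("f", j)} if a == j else {("o", j)}  # next letter is on an odd position


def buchi_accepts(n, u, v):
    """Decide whether B_n accepts u v^omega (v nonempty)."""
    def successors(node):
        q, k = node
        return {(p, (k + 1) % len(v)) for p in delta_b(n, q, v[k])}

    def reachable(sources):
        seen, stack = set(), list(sources)
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(successors(x))
        return seen

    current = {("o", 0)}
    for a in u:
        current = {p for q in current for p in delta_b(n, q, a)}
    nodes = reachable({(q, 0) for q in current})
    return any(q[0] == "f" and (q, k) in reachable(successors((q, k)))
               for (q, k) in nodes)


def in_complement(u, v):
    """True iff some letter is infinitely often odd but not infinitely often even."""
    loop = list(v) if len(v) % 2 == 0 else list(v) * 2
    offset = len(u) % 2
    even = {a for k, a in enumerate(loop) if (k + offset) % 2 == 0}
    odd = {a for k, a in enumerate(loop) if (k + offset) % 2 == 1}
    return not odd <= even


def cross_check(n, max_len):
    letters = range(1, n + 1)
    return all(buchi_accepts(n, u, v) == in_complement(u, v)
               for lu in range(max_len + 1)
               for u in product(letters, repeat=lu)
               for lv in range(1, max_len + 1)
               for v in product(letters, repeat=lv))
```

Under these assumptions the automaton has $3n + 2$ states, in line with the $O(n)$ bound claimed in Theorem 8.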



Table 1. Synopsis of automaton transformations and pointers to optimality
results. (The transformation * is the only one that is not known to be optimal.)

From \ To       | Muller (det.)  | Rabin (det.)        | Streett (det.)                 | Parity (det.)
----------------+----------------+---------------------+--------------------------------+-------------------------------
Büchi (ndet.)   | Safra trees *  | Safra trees; Thm 6  | 1. Safra trees, 2. IAR; Thm 8  | 1. Safra trees, 2. IAR; Thm 6
Muller (det.)   |                | SAR; Thm 7          | SAR; Thm 7                     | SAR; Thm 7
Rabin (det.)    | trivial        |                     | IAR; Thm 7                     | IAR; Thm 7
Streett (det.)  | trivial        | IAR; Thm 7          |                                | IAR; Thm 7






5 Conclusion 

For several different transformations of $\omega$-automata we have seen that the lower
bound is $2^{\Omega(n \log n)}$. The two basic constructions considered in this paper (Safra
trees and appearance records) meet these lower bounds and therefore are of
optimal complexity. In comparison to the theory of $*$-automata, where
determinization is exponential too, but with a linear exponent, we get an additional
factor of $\log n$ in the exponent for transformations of $\omega$-automata.
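The step from the factorial bounds of Theorems 6 to 8 to an exponent of order $n \log n$ rests on the elementary estimate $(n/e)^n \leq n! \leq n^n$, a standard fact that is not stated explicitly in the paper:

```latex
\[
  n \log_2 n - n \log_2 e \;\le\; \log_2 n! \;\le\; n \log_2 n
  \qquad\Longrightarrow\qquad
  n! = 2^{\Theta(n \log n)} .
\]
```

By contrast, the subset construction for automata on finite words needs at most $2^n$ states, which is the linear exponent referred to above.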

An unsolved problem is the lower bound for the transformation of
nondeterministic Büchi automata into deterministic Muller automata. The known lower
bound is $2^{\Omega(n)}$, which can be proven by a simple pumping argument as for
$*$-automata.



References 

1. J.R. Büchi. On a decision method in restricted second order arithmetic. In Proc.
International Congress on Logic, Method and Philos. Sci. 1960, pages 1-11, 1962.

2. J.R. Büchi. Winning state-strategies for boolean-$F_\sigma$ games. Manuscript, 1981.

3. J.R. Büchi. State-strategies for games in $F_{\sigma\delta} \cap G_{\delta\sigma}$. Journal of Symbolic Logic,
48(4):1171-1198, December 1983.

4. S. Dziembowski, M. Jurdziński, and I. Walukiewicz. How much memory is needed
to win infinite games? In Proc. IEEE, LICS, 1997.

5. Y. Gurevich and L. Harrington. Trees, automata and games. In Proc. 14th ACM
Symp. on the Theory of Computing, pages 60-65, 1982.

6. S. Krishnan, A. Puri, and R. Brayton. Structural complexity of $\omega$-automata. In
12th Annual Symposium on Theoretical Aspects of Computer Science, volume 900
of Lecture Notes in Computer Science, pages 143-156. Springer, 1995.

7. L.H. Landweber. Decision problems for $\omega$-automata. Math. System Theory, 3:376-384,
1969.

8. C. Löding. Methods for the transformation of $\omega$-automata: Complexity and
connection to second order logic. Master's thesis, Christian-Albrechts-University of
Kiel, 1998.

9. R. McNaughton. Testing and generating infinite sequences by a finite automaton.
Information and Control, 9:521-530, 1966.

10. M. Michel. Complementation is much more difficult with automata on infinite
words. Manuscript, CNET, Paris, 1988.

11. A.W. Mostowski. Regular expressions for infinite trees and a standard form of
automata. Lecture Notes in Computer Science, 208:157-168, 1984.

12. D.E. Muller. Infinite sequences and finite machines. In Proc. 4th IEEE Symposium
on Switching Circuit Theory and Logical Design, pages 3-16, 1963.

13. M.O. Rabin. Decidability of second order theories and automata on infinite trees.
Transaction of the AMS, 141:1-35, 1969.

14. S. Safra. On the complexity of $\omega$-automata. In Proc. 29th IEEE Symp. on
Foundations of Computer Science, pages 319-327, 1988.

15. S. Safra. Exponential determinization for $\omega$-automata with strong-fairness
acceptance condition. In Proc. 24th ACM Symp. on the Theory of Computing, pages
275-282, 1992.

16. S. Safra and M. Y. Vardi. On $\omega$-automata and temporal logic. In Proc. 21st ACM
Symp. on the Theory of Computing, 1989.

17. R.S. Streett. Propositional dynamic logic of looping and converse is elementary
decidable. Information and Control, 54:121-141, 1982.

18. W. Thomas. Languages, automata, and logic. In G. Rozenberg and A. Salomaa,
editors, Handbook of Formal Language Theory, volume III, pages 385-455.
Springer-Verlag, 1997.



CTL+ Is Exponentially More Succinct than CTL 



Thomas Wilke 

Lehrstuhl für Informatik VII,
RWTH Aachen, 52056 Aachen, Germany
wilke@informatik.rwth-aachen.de



Abstract. It is proved that CTL+ is exponentially more succinct than
CTL. More precisely, it is shown that every CTL formula (and every
modal $\mu$-calculus formula) equivalent to the CTL+ formula

  $E(Fp_0 \wedge \cdots \wedge Fp_{n-1})$

is of length at least $\binom{n}{\lceil n/2 \rceil}$, which is $\Omega(2^n/\sqrt{n})$. This almost matches
the upper bound provided by Emerson and Halpern, which says that for
every CTL+ formula of length $n$ there exists an equivalent CTL formula
of length at most $2^{n \log n}$.

It follows that the exponential blow-up incurred in known conversions
of nondeterministic Büchi word automata into alternation-free $\mu$-calculus
formulas is unavoidable. This answers a question posed by Kupferman
and Vardi.

The proof of the above lower bound exploits the fact that for every CTL
($\mu$-calculus) formula there exists an equivalent alternating tree automaton
of linear size. The core of this proof is an involved cut-and-paste
argument for alternating tree automata.



1 Introduction 

Expressiveness and succinctness are two important aspects to consider when one 
investigates a (specification) logic. When studying the expressiveness of a logic 
one is interested in characterizing what properties can be expressed, whereas 
when studying the succinctness one is interested in how short a formula can be 
found to express a given property. Succinctness is especially of importance in a 
situation where one has characterized the expressive power of a logic by a differ- 
ent but equally expressive logic. In such a situation, succinctness is the foremost 
quantitative measure to distinguish the logics. For instance, linear-time temporal
logic (LTL) is known to be exactly as expressive as first-order logic (FO) [9],
but FO is much more succinct than LTL: from work by Stockmeyer [11] it
follows that there exists a sequence of FO formulas of linear length such that the
length of shortest equivalent LTL formulas cannot be bounded by an elementary
recursive function.

In this paper, the succinctness of computation tree logic (CTL) is compared
to the succinctness of CTL+, an extension of CTL, which is known to have
exactly the same expressive power as CTL [4,5]. I present a sequence of CTL+
formulas of length $O(n)$ such that the length of shortest equivalent CTL formulas
is $\Omega(2^n/\sqrt{n})$. More precisely, I prove that every CTL formula equivalent to the
CTL+ formula

  $E(Fp_0 \wedge \cdots \wedge Fp_{n-1})$

is of length at least $\binom{n}{\lceil n/2 \rceil}$, which shows that CTL+ is exponentially more
succinct than CTL. This lower bound is almost tight, because a result by Emerson
and Halpern [4,5] says that for every CTL+ formula of length $n$ there exists
an equivalent CTL formula of length at most $2^{n \log n}$.

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 110-121, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999

It is important to note that this exponential lower bound is not based on
any complexity-theoretic assumption, and it does not follow from the fact that
model checking for CTL is known to be P-complete whereas model checking for
CTL+ is NP- and co-NP-hard (and in $\Delta_2^p$) [3,4,5].

The proof of the lower bound presented in this paper makes use of
automata-theoretic arguments, following other approaches to similar questions. The main
idea is based on the following fact. For every CTL formula (and for every
$\mu$-calculus formula) $\varphi$ there exists an alternating tree automaton $A_\varphi$ of size linear
in the length of $\varphi$ that accepts exactly the models of $\varphi$ [6,1,2]. So in order to
obtain a lower bound on the length of the CTL (or $\mu$-calculus) formulas defining
a given class of Kripke structures,¹ it is enough to establish a lower bound on
the number of states of the alternating tree automata recognizing the given class
of structures.

As mentioned above, automata-theoretic arguments have been used in this
way in different places, for instance by Etessami, Vardi, and myself in [8] or
Kupferman and Vardi in [10]. The difference, however, is that in this paper the
automaton model (alternating automata on trees) is rather intricate compared
to the automaton models used in [8] and [10] (nondeterministic automata on
words and nondeterministic automata on trees, respectively).

The more elaborate argument that is needed here also answers a question
raised in the paper by Kupferman and Vardi. A particular problem they
consider is constructing for a given nondeterministic Büchi word automaton an
alternation-free $\mu$-calculus (AFMC) formula that denotes in every Kripke
structure the set of all worlds where all infinite paths originating in this world are
accepted by the automaton. They show that if such a formula exists, then there
is a formula of size at most exponential in the number of states of the given Büchi
automaton, but they cannot give a matching lower bound. This is provided in
this paper.

Outline. In Section 2, the syntax and semantics of CTL and CTL+ are briefly
reviewed and the main result of the paper is presented. In Section 3, alternating
tree automata are briefly reviewed and subsequently, in Section 4, the
succinctness problem is reduced to an automata-theoretic problem. Section 5 describes
the latter in a more general setting and in Section 6 a sketch is given of the
solution of this more general problem. Section 7 presents consequences, and Section 8
gives a conclusion.

¹ Strictly speaking, a CTL formula defines a class of pointed Kripke structures, see
Section 2.

This paper is an extended abstract; for details of the proofs, see the technical 
report [12]. 

Acknowledgment. I would like to thank Kousha Etessami, Martin Grohe, Neil
Immerman, Christof Löding, Philippe Schnoebelen, and Moshe Y. Vardi for
having discussed with me the problem addressed in this paper.

Trees and tree arithmetic. In this paper, a tree is a triple $(V, E, \lambda)$ where $(V, E)$
is a directed tree in the graph-theoretic sense and $\lambda$ is a labeling function with
domain $V$. By convention, when $T$ denotes a tree, then $V$, $E$, and $\lambda$ always
denote the set of nodes, set of edges, and labeling function of $T$. The same
applies to decorations such as $T'$, $T^*$, $T_i$, etc.

Let $T$ be an arbitrary tree. A node $v' \in V$ is a successor of a node $v \in V$ in $T$
if $(v, v') \in E$. The set of all successors of a node $v$ in $T$ is denoted by $Scs(T, v)$.
The set of leaves of a tree $T$, that is, the set of nodes without successors, is
denoted by $Lvs(T)$. The set of inner nodes is denoted by $In(T)$.

Given a tree $T$ and a vertex $v$ of $T$, the ancestors path, denoted $T{\uparrow}v$, is
the unique path from the root of $T$ to $v$ (inclusively). The descendants tree,
denoted $T{\downarrow}v$, is the subgraph of $T$ induced by all nodes reachable from $v$ ($v$
itself included).

I will use two kinds of concatenations for trees. When $T$ and $T'$ are trees
and $v$ is a node of $T$, then $T \cdot (v, T')$ denotes the tree that results from $T$ by first
making an isomorphic copy of $T'$ whose node set is disjoint from the set of nodes
of $T$ and then adding an edge from $v$ to the root of $T'$. Similarly, $T \odot (v, T')$
denotes the tree that results from $T$ by first making an isomorphic copy of $T'$
whose node set is disjoint from the set of nodes of $T$ and then identifying the
root of the isomorphic copy of $T'$ with $v$. By convention, the node $v$ is retained in
the resulting tree (rather than the root of $T'$) and the label of $v$ is kept. These
two concatenation operations are extended in a straightforward way: when $T$ is
a tree and $M$ a set of pairs $(v, T')$, with $v \in V$ and $T'$ an arbitrary tree, I might
write $T \cdot M$ and $T \odot M$ to denote the result of concatenating (in the respective
way) all trees from $M$ to $T$.

For ease of notation, when $\pi$ is a finite path (a finite tree with exactly one
leaf) with leaf $v$ and $T$ is a tree, I simply write $\pi \cdot T$ for the tree $\pi \cdot (v, T)$ as
defined above. To make things even simpler, I view strings as finite paths and
vice versa. So when $u$ is a string and $T$ a tree, I might write $u \cdot T$ to denote the
tree which is obtained by viewing $u$ as a path and concatenating $T$ to it.
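The two concatenation operations can be sketched on a simple adjacency representation. Everything below (the `Tree` class, the global counter that supplies fresh node names, and the function names `attach` for the first operation and `glue` for the second) is an illustrative assumption, not notation from the paper.

```python
from dataclasses import dataclass, field
from itertools import count

_fresh = count()                      # supplies node names that are never reused


@dataclass
class Tree:
    root: int
    children: dict = field(default_factory=dict)   # node -> list of successors
    label: dict = field(default_factory=dict)      # node -> label


def path(labels):
    """View a nonempty string as a finite path, one node per letter."""
    nodes = [next(_fresh) for _ in labels]
    children = {v: [] for v in nodes}
    for a, b in zip(nodes, nodes[1:]):
        children[a].append(b)
    return Tree(nodes[0], children, dict(zip(nodes, labels)))


def leaves(t):
    return [v for v, cs in t.children.items() if not cs]


def _copy(t):
    """Isomorphic copy of t on fresh nodes."""
    ren = {v: next(_fresh) for v in t.children}
    return Tree(ren[t.root],
                {ren[v]: [ren[c] for c in cs] for v, cs in t.children.items()},
                {ren[v]: lab for v, lab in t.label.items()})


def attach(t, v, t2):
    """First concatenation: add an edge from v to the root of a copy of t2."""
    c = _copy(t2)
    children = {**t.children, **c.children}
    children[v] = t.children[v] + [c.root]
    return Tree(t.root, children, {**t.label, **c.label})


def glue(t, v, t2):
    """Second concatenation: identify the root of a copy of t2 with v,
    keeping the node v and its label."""
    c = _copy(t2)
    children = {**t.children,
                **{x: cs for x, cs in c.children.items() if x != c.root}}
    children[v] = t.children[v] + c.children[c.root]
    label = {**t.label, **{x: lab for x, lab in c.label.items() if x != c.root}}
    return Tree(t.root, children, label)
```

Both functions return a new tree and leave their arguments unchanged, so the same subtree can be concatenated at several nodes, as in the extension to sets $M$ of pairs.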



2 CTL, CTL+, and Main Result 

I start by recalling the syntax and the semantics of CTL and CTL+. For
technical reasons, I only define formulas in positive normal form. This is not an
essential restriction, because every CTL formula is equivalent to a CTL formula
in positive normal form of the same length, and the same applies to CTL+.

Let $Prop = \{p_0, p_1, p_2, \ldots\}$ be an infinite supply of distinct propositional
variables. The set of all CTL+ formulas and the set of all path formulas are
defined simultaneously as follows.

1. 0 and 1 are CTL+ formulas.

2. For $p \in Prop$, $p$ and $\neg p$ are CTL+ formulas.

3. If $\varphi$ and $\psi$ are CTL+ formulas, then so are $\varphi \vee \psi$ and $\varphi \wedge \psi$.

4. Every CTL+ formula is a path formula.

5. If $\varphi$ and $\psi$ are CTL+ formulas, then $X\varphi$, $U(\varphi, \psi)$, and $R(\varphi, \psi)$ are path
formulas.

6. If $\varphi$ and $\psi$ are path formulas, then so are $\varphi \vee \psi$ and $\varphi \wedge \psi$.

7. If $\varphi$ is a path formula, then $E\varphi$ and $A\varphi$ are CTL+ formulas.

A CTL+ formula is a CTL formula when it can be constructed without using
rule 6. That is, in CTL formulas every path quantifier (E or A) is followed
immediately by a temporal modality (X, U, or R). As usual, I use $F\varphi$ (eventually $\varphi$)
as an abbreviation for $U(1, \varphi)$.

CTL and CTL'*' formulas are interpreted in Kripke structures, which are di- 
rected graphs with specific labeling functions for their nodes. Formally, a Kripke 
structure is a tuple (EF, R, a) where IF is a set of worlds, i? C IF x IF is an 
accessibility relation, and a : IF ^ 2 ^™p is a labeling function, which assigns to 
each world the set of propositional variables that hold true in it. By convention, 
Kripke structures are denoted by K or decorated versions of K such as K' or 
K* , and their components are referred to as IF, R, and a, respectively decorated 
versions of these letters. 

Given a world w of a Kripke structure K as above, a world w' is called a 
successor of w in K if {w,w') G R. Just as with trees, the set of all successors 
of a world w is denoted by Scs{K, w). A path through a Kripke structure K as 
above is a nonempty sequence wq,wi, . . . such that {wq,wi) G R, (wi, W 2 ) G R, 
... A maximal path is a path that is either infinite or finite and ends in a world 
without successors. 

A pointed Kripke structure is a pair $(K, w)$ of a Kripke structure and a
distinguished world of it. A path through a pointed Kripke structure $(K, w)$ is a
path through $K$ starting in $w$. A path-equipped Kripke structure is a pair $(K, \pi)$
of a Kripke structure and a maximal path through it.

For every CTL+ formula and every path formula $\varphi$, one defines in a straightforward
way what it means for $\varphi$ to hold in a pointed Kripke structure $(K, w)$ or,
respectively, a path-equipped Kripke structure $(K, \pi)$, and denotes this by
$(K, w) \models \varphi$ or, respectively, $(K, \pi) \models \varphi$. For instance, when $\varphi$ is a path
formula, then $(K, w) \models E\varphi$ if there exists a maximal path $\pi$ through $(K, w)$ such
that $(K, \pi) \models \varphi$. For details, the reader is referred to [5].

Given a CTL+ formula $\varphi$, I write $\mathrm{Mod}(\varphi)$ for the class of all pointed Kripke
structures that are models of $\varphi$, i.e., $\mathrm{Mod}(\varphi) = \{(K, w) \mid (K, w) \models \varphi\}$. CTL+
formulas $\varphi$ and $\psi$ are equivalent if they have the same models, i.e., if
$\mathrm{Mod}(\varphi) = \mathrm{Mod}(\psi)$.






The main result of this paper is: 

Theorem 1. Every CTL formula equivalent to the CTL+ formula $\varphi_n$ defined by

$$\varphi_n = E(Fp_0 \wedge \cdots \wedge Fp_{n-1}) \qquad (1)$$

has length at least $\binom{n}{\lceil n/2 \rceil}$, which is $\Omega(2^n/\sqrt{n})$.

In other words, CTL+ is exponentially more succinct than CTL.

Note that it is easy to come up with a formula of length $O(n!)$ equivalent
to $\varphi_n$, namely as a disjunction with $n!$ many disjuncts, each taking care of one
possible order in which the $p_i$'s may occur on a path.
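The following Python sketch makes the contrast concrete. It assumes one natural shape for the disjuncts, namely nested formulas of the form EF(p ∧ EF(...)), which the text does not spell out; the function names and the ASCII syntax (`&`, `|`) are hypothetical and chosen only for illustration.

```python
from itertools import permutations
from math import comb, factorial


def ordered_disjunct(order):
    """CTL formula stating that the variables occur on one path in this order."""
    formula = ""
    # Build from the innermost variable outwards: EF(p_a & EF(p_b & ...)).
    for index in reversed(order):
        formula = f"EF(p{index} & {formula})" if formula else f"EF p{index}"
    return formula


def naive_ctl_formula(n):
    """Disjunction with n! disjuncts, one per order of p_0, ..., p_{n-1} (n >= 1)."""
    return " | ".join(ordered_disjunct(order) for order in permutations(range(n)))


def lower_bound(n):
    """The bound of Theorem 1: n choose ceil(n/2)."""
    return comb(n, (n + 1) // 2)


if __name__ == "__main__":
    print(naive_ctl_formula(2))
    for n in range(1, 8):
        print(n, factorial(n), lower_bound(n))
```

The loop at the end prints, for small n, the number of disjuncts of the naive translation next to the lower bound, which shows the gap that remains between the two.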



3 Alternating Tree Automata 

As indicated in the abstract and the introduction, I use an automata-theoretic 
argument to prove Theorem 1. In this section, the respective automaton model, 
which differs from other models used in the literature, is introduced. 

First, it can handle trees with arbitrary degree of branching in a simple way. 
Second, the class of objects accepted by an automaton as defined here is a class 
of pointed Kripke structures rather than just a set of trees. Both facts make it 
much easier to phrase theorems such as Theorem 2 below and also simplify the 
presentation of a combinatorial (lower-bound) argument like the one given in 
Section 6. 

An alternating tree automaton (ATA) is a tuple $A = (Q, P, q_I, \delta, \Omega)$ where $Q$
is a finite set of states, $P$ is a finite subset of Prop, $q_I \in Q$ is an initial state, $\delta$ is
a transition function as specified below, and $\Omega$ is an acceptance condition for
$\omega$-automata such as a Büchi or Muller condition. The same notational conventions
as with Kripke structures apply.

The transition function $\delta$ is a function $Q \times 2^P \to \mathrm{TC}(Q)$, where $\mathrm{TC}(Q)$ is
the set of transition conditions over $Q$, which are defined by the following rules.

1. 0 and 1 are transition conditions over $Q$.

2. For every $q \in Q$, $q$ is a transition condition over $Q$.

3. For every $q \in Q$, $\Box q$ and $\Diamond q$ are transition conditions over $Q$.

4. If $\varphi$ and $\psi$ are transition conditions over $Q$, then $\varphi \wedge \psi$ and $\varphi \vee \psi$ are
transition conditions over $Q$.

A transition condition is said to be $\epsilon$-free if rule 2 is not needed to build it. An
ATA is $\epsilon$-free if every condition $\delta(q, a)$ for $q \in Q$ and $a \in 2^P$ is $\epsilon$-free; it is in
normal form if it is $\epsilon$-free, the conditions $\delta(q, a)$ are in disjunctive normal form,
and neither 0 nor 1 occurs in these conditions.

ATA’s work on pointed Kripke structures. Their computational behavior is
explained using the notion of a run. Assume $A$ is an ATA as above and $(K, w_I)$
a pointed Kripke structure as above. A run of $A$ on $(K, w_I)$ is a $(W \times Q)$-
labeled tree $R = (V, E, \lambda)$ satisfying the conditions described further below.
To explain these conditions, some more definitions are needed. For simplicity
in notation, I will write $w_R(v)$ and $q_R(v)$ for the first and second component
of $\lambda(v)$, respectively.

For every node $v$ of $R$, I define what it means for a transition condition $\tau$
over $Q$ to hold in $v$, denoted $K, R, v \models \tau$. This definition is by induction on the
structure of $\tau$, where the boolean constants 0 and 1 and the boolean connectives
are dealt with in the usual way; besides:

- $K, R, v \models q$ if there exists $v' \in \mathrm{Scs}(R, v)$ such that $\lambda(v') = (w_R(v), q)$,

- $K, R, v \models \Diamond q$ if there exist $v' \in \mathrm{Scs}(R, v)$ and $w \in \mathrm{Scs}(K, w_R(v))$ such
that $\lambda(v') = (w, q)$, and

- $K, R, v \models \Box q$ if for every $w \in \mathrm{Scs}(K, w_R(v))$ there exists $v' \in \mathrm{Scs}(R, v)$
such that $\lambda(v') = (w, q)$.

The two additional conditions that are required of a run are the following. 

1. Initial condition. Let $v_0$ be the root of $(V, E)$. Then $\lambda(v_0) = (w_I, q_I)$.

2. Local consistency. For every $v \in V$,

$$K, R, v \models \tau_v \qquad (2)$$

where

$$\tau_v = \delta(q_R(v), \alpha(w_R(v)) \cap P) . \qquad (3)$$

Note that the intersection with P allows us to deal easily with the fact that in 
the definition of Kripke structure an infinite number of propositional variables 
is always present. 
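The local consistency condition can be checked mechanically on finite data. The sketch below is only an illustration under assumptions of mine: transition conditions are encoded as nested tuples, the Kripke structure and the run are given as dictionaries, and all names (`holds`, `locally_consistent`, the toy automaton) are hypothetical rather than taken from the paper.

```python
def holds(cond, v, succ, children, label):
    """Decide K, R, v |= cond for a transition condition given as a nested tuple."""
    world, _ = label[v]
    child_labels = {label[c] for c in children.get(v, ())}
    kind = cond[0]
    if kind == "const":                      # boolean constants 0 and 1
        return bool(cond[1])
    if kind == "eps":                        # q: a child in the same world
        return (world, cond[1]) in child_labels
    if kind == "dia":                        # diamond q: some successor world
        return any((w, cond[1]) in child_labels for w in succ.get(world, ()))
    if kind == "box":                        # box q: every successor world
        return all((w, cond[1]) in child_labels for w in succ.get(world, ()))
    if kind in ("and", "or"):
        left = holds(cond[1], v, succ, children, label)
        right = holds(cond[2], v, succ, children, label)
        return (left and right) if kind == "and" else (left or right)
    raise ValueError(f"unknown transition condition: {cond!r}")


def locally_consistent(delta, alpha, props, succ, children, label):
    """Check condition (2) at every node, with tau_v computed as in (3)."""
    return all(
        holds(delta(state, frozenset(alpha[world]) & props), v, succ, children, label)
        for v, (world, state) in label.items()
    )


# Toy instance (all names hypothetical): a one-state automaton for "EF p0".
PROPS = frozenset({"p0"})
SUCC = {"w0": ["w1", "w2"]}
ALPHA = {"w0": set(), "w1": {"p0"}, "w2": set()}


def delta(state, letter):
    return ("const", 1) if "p0" in letter else ("dia", state)


CHILDREN = {"v0": ["v1"]}
LABEL = {"v0": ("w0", "q"), "v1": ("w1", "q")}

if __name__ == "__main__":
    print(locally_consistent(delta, ALPHA, PROPS, SUCC, CHILDREN, LABEL))
```

Note how the intersection with `props` appears literally in `locally_consistent`; the acceptance condition on infinite paths is not covered by this check.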

A run $R$ is said to be accepting if the state labeling of every infinite path
through $R$ satisfies the given acceptance condition $\Omega$. For instance, if $\Omega \subseteq 2^Q$ is
a Muller condition, then every infinite path $v_0, v_1, \ldots$ through $R$ must have
the property that the set formed by the states occurring infinitely often in
$q_R(v_0), q_R(v_1), \ldots$ is a member of $\Omega$.
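For an ultimately periodic state sequence, the Muller condition reduces to a simple set lookup. The sketch below assumes the sequence is given as a finite prefix followed by a nonempty loop that repeats forever; this representation and the function name are my own choices, not part of the paper.

```python
def muller_accepts(prefix, loop, omega):
    """Muller acceptance for the state sequence prefix . loop . loop . ...

    The states occurring infinitely often are exactly those in the loop,
    so the prefix plays no role in the decision.
    """
    if not loop:
        raise ValueError("the repeated part must be nonempty")
    return frozenset(loop) in omega


if __name__ == "__main__":
    omega = {frozenset({"b", "c"})}
    print(muller_accepts(["a"], ["b", "c", "b"], omega))
```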

A pointed Kripke structure is accepted by $A$ if there exists an accepting run
of $A$ on the Kripke structure. The class of pointed Kripke structures accepted
by $A$ is denoted by $\mathcal{K}(A)$; it is said that $A$ recognizes $\mathcal{K}(A)$.

Throughout this paper, the same notational conventions as with Kripke struc- 
tures and alternating tree automata apply to runs. 



4 Reduction to Automata-Theoretic Problem 

In order to reduce the lower bound claim for the translation from CTL+ to CTL 
to a claim on alternating tree automata, I describe the models of a CTL formula 
by an alternating tree automaton, following the ideas of Kupferman, Vardi, and
Wolper [2], but using the more general model of automaton.

Let $\varphi$ be an arbitrary CTL formula and $P$ the set of propositional variables
occurring in $\varphi$. The ATA $A_\varphi$ is defined by $A_\varphi = (Q, P, \varphi, \delta, \Omega)$ where $Q$ is the
set of all CTL subformulas of $\varphi$ including $\varphi$ itself, $\Omega$ is the Muller acceptance
condition that contains all sets of subformulas of $\varphi$ that do not contain formulas
starting with EU or AU, and $\delta$ is defined by induction, where, for instance, the
inductive step for EU is given by

$$\delta(EU(\psi, \chi), a) = \chi \vee (\psi \wedge \Diamond EU(\psi, \chi)) . \qquad (4)$$

The other cases are similar and follow the ideas of [2]. Note that on the right-
hand side of (4) the boolean connectives $\vee$ and $\wedge$ are part of the syntax of
transition conditions.

Similar to [2], one can prove by a straightforward induction on the structure
of $\varphi$:

Theorem 2. Let $\varphi$ be an arbitrary CTL formula of length $l$. Then $A_\varphi$ is an
ATA with at most $l$ states such that $\mathrm{Mod}(\varphi) = \mathcal{K}(A_\varphi)$.

It is quite easy to see that for every ATA there exists an equivalent ATA in 
normal form with the same number of states. So in order to prove Theorem 1 
we only need to show: 

Theorem 3. Every ATA in normal form recognizing $\mathrm{Mod}(\varphi_n)$ has at least
$\binom{n}{\lceil n/2 \rceil}$ states.

5 The General Setting 

The method I use to prove Theorem 3 (a cut-and-paste argument) does not only
apply to the specific properties defined by the $\varphi_n$'s but to a large class of “path
properties.” As with many other situations, the method is best understood when
presented in its full generality. In this section, I explain the general setting and 
present the extended version of Theorem 3, namely Theorem 4. 

In the following, word stands for nonempty string or $\omega$-word. The set of all
words over a given alphabet $A$ is denoted by $A^\infty$. A language is a subset of
$(2^P)^\infty$ where $P$ is some finite subset of Prop. Given a language $L$ over some
alphabet $2^P$, $EL$ denotes the class of pointed Kripke structures $(K, w)$ where
there exists a maximal path through $(K, w)$ whose labeling (restricted to the
propositional variables in $P$) is an element of $L$. (Remember that a path through
a pointed Kripke structure always starts in its distinguished world.)

Observe that for every $n$, we clearly have $\mathrm{Mod}(\varphi_n) = EL_n$ where

$$L_n = \{a_0 a_1 \cdots \in (2^{P_n})^\infty \mid \forall i\,(i < n \rightarrow \exists j\,(p_i \in a_j))\}$$

and $P_n = \{p_0, \ldots, p_{n-1}\}$.

Let $L$ be a regular language. We say a family $\{(u_i, u'_i)\}_{i<m}$ is a discriminating
family for $L$ if $u_i u'_i \in L$ and $u_i u'_j \notin L$ for all $i < m$ and all $j < m$ with
$j \neq i$. Obviously, the number of classes of the Nerode congruence¹ associated
with $L$ is an upper bound for $m$. The maximum number such that there exists
a discriminating family of that size for $L$ is denoted $\iota(L)$.

The generalized version of Theorem 3 now reads: 

¹ The Nerode congruence of a language $L$ is the congruence that considers strings $u$
and $v$ equivalent if for every word $x$ (including the empty word), $ux \in L$ iff $vx \in L$.






Theorem 4. Let $L$ be a regular language. Then every ATA recognizing $EL$ has
at least $\iota(L)$ states.

Before we turn to the proof of this theorem in the next section, let’s apply
it to the languages $L_n$ (as defined above) to obtain the desired lower bounds.

Fix an arbitrary natural number $n > 1$ and let $m = \lceil n/2 \rceil$ and
$t = \binom{n}{m}$. Write $N$ for the set $\{0, \ldots, n-1\}$ and $\overline{M}$ for the set-theoretic
complement of $M$ with respect to $N$. For every $M \subseteq N$, let $u(M)$ be a string over
$2^{P_n}$ of length $|M|$ such that for every $i \in M$, the letter $\{p_i\}$ occurs in $u(M)$. (In
other words, $u(M)$ should be a sequence of singletons where for each $i \in M$ the
singleton $\{p_i\}$ occurs exactly once and no other singleton occurs.) Let
$M_0, \ldots, M_{t-1}$ be an enumeration of all $m$-subsets of $N$ and let $u_i = u(M_i)$ and
$u'_i = u(\overline{M_i})$. Then $\{(u_i, u'_i)\}_{i<t}$ is a discriminating family for $L_n$, which means
$\iota(L_n) \geq \binom{n}{\lceil n/2 \rceil}$.

This together with Theorem 4 implies Theorem 3 and thus also Theorem 1. 
(Observe that for n = 1 the claims of Theorems 3 and 1 are trivial.) 
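The construction of the discriminating family can be replayed for small $n$. The sketch below is restricted to finite words, represents a letter $\{p_i\}$ by the set of indices $\{i\}$, and uses function names of my own choosing; it is meant as a sanity check of the argument, not as part of the proof.

```python
from itertools import combinations
from math import comb


def in_L(word, n):
    """Membership of a finite nonempty word in L_n: every index i < n occurs."""
    seen = set().union(*word)
    return seen >= set(range(n))


def u(M):
    """String of singletons {i}, one for each i in M."""
    return [frozenset({i}) for i in sorted(M)]


def discriminating_family(n):
    """Pairs (u(M), u(complement of M)) for all subsets M of size ceil(n/2)."""
    m = (n + 1) // 2
    N = set(range(n))
    return [(u(M), u(N - set(M))) for M in combinations(range(n), m)]


def is_discriminating(family, n):
    """u_i u'_j lies in L_n exactly when i == j."""
    return all(
        in_L(ui + upj, n) == (i == j)
        for i, (ui, _) in enumerate(family)
        for j, (_, upj) in enumerate(family)
    )


if __name__ == "__main__":
    for n in range(2, 7):
        family = discriminating_family(n)
        print(n, len(family), is_discriminating(family, n))
```

For each $n$ the family has $\binom{n}{\lceil n/2 \rceil}$ members, matching the bound claimed for $\iota(L_n)$.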

6 Saturation 

In this section, I will introduce the key concepts used in the proof of Theorem 4,
state the main lemmas, provide as much intuition as is possible within the page
limit, and give a rough outline of the proof of Theorem 4.

We will see trees in two different roles. On the one hand, we will look at runs 
of ATA’s, and runs of ATA’s are trees by definition. On the other hand, we will 
consider Kripke structures that are trees. In order to not get confused, I will 
strictly follow the notational conventions introduced earlier, for instance, that 
the labeling function of a run $R'$ is referred to by $\lambda'$. As we will only work with
Kripke structures that are trees, I will use the term Kripke tree. A Kripke tree 
will also be viewed as a pointed Kripke structure where the root of the tree is 
the distinguished node. 

For the rest of this section, fix a language $L$ over some alphabet $2^P$, and an
ATA $A$. For each state $q$, write $A_q$ for the ATA that results from $A$ by changing
its initial state to $q$ and $\mathcal{K}_q$ for the class $\mathcal{K}(A_q)$, the class of pointed Kripke
structures recognized by $A_q$.

Let $u$ be a string. A state $q$ is preventable for $u$ if there exists a Kripke tree $K$
such that $u \cdot K \notin EL$ and $K \notin \mathcal{K}_q$. We write $\mathrm{pvt}(u)$ for the set of all states
preventable for $u$, and for every $q \in \mathrm{pvt}(u)$, we pick, once and for all, a Kripke
tree $K$ as above and denote it by $K^u_q$. The important observation here is that if $K$
is a Kripke tree, $w \in W$, and $q \in \mathrm{pvt}(K{\uparrow}w)$, then $K'$ defined by $K' = K \cdot (w, K^u_q)$
with $u = K{\uparrow}w$ has the following two properties. First, if $K \notin EL$, then $K' \notin EL$.
Second, there is no run $R$ of $A$ on $K'$ that has a node $v$ with $w_R(v) = w$ and
$K', R, v \models \Box q$. In a certain sense, by adding $K^u_q$ to $K$, the condition $\Box q$ is
“prevented” from being used at $w$.

A state $q$ is always successful for $u$ if there exists a state $q' \in \mathrm{pvt}(u)$ such
that $K^u_{q'} \in \mathcal{K}_q$. We write $\mathrm{scf}(u)$ for the set of states always successful for $u$, and
for every $q \in \mathrm{scf}(u)$, we pick, once and for all, a state $q'$ as above and denote it
by $q^u$. (Note that whether or not a state is always successful for a string depends
on the particular choices for the $K^u_q$'s.) The important observation here is the
following. Choose $K$, $w$, $u$, and $K'$ as in the previous paragraph. If $q \in \mathrm{scf}(u)$
and if we want to construct a run $R$ of $A$ on $K'$, then we can always make sure
that $K', R, v \models \Diamond q$ holds for a node $v$ with $w_R(v) = w$, because we only need to
add to $R$ a successful run of $A_q$ on $K^u_{q'}$ with $q' = q^u$. Formally, if $R^u_{q'}$ is such a
run, we only need to consider $R \cdot (v, R^u_{q'})$ instead of $R$.

A world $w$ of a Kripke tree $K$ is said to be saturated if for every $q \in \mathrm{pvt}(K{\uparrow}w)$,
there exists $w' \in \mathrm{Scs}(K, w)$ such that $K{\downarrow}w'$ is isomorphic to $K^u_q$ with $u = K{\uparrow}w$.

Let $K$ be an arbitrary Kripke tree. The Kripke tree $K^s$ is defined by

$$K^s = K \cdot \{(w, K^u_q) \mid w \in \mathrm{In}(K),\ u = K{\uparrow}w, \text{ and } q \in \mathrm{pvt}(u)\} , \qquad (5)$$

that is, in $K^s$, every inner world from $K$ is saturated.

Remark 1. Let $K$ be an arbitrary Kripke tree. If $K \in EL$, then $K^s \in EL$.

This is because every maximal path through $K$ is also present in $K^s$: no
successors are added to leaves.

Let $\tau$ be an arbitrary transition condition over $Q$ and $X, Y \subseteq Q$. The $X$-$Y$-
reduct of $\tau$, denoted $\tau^{X,Y}$, is obtained from $\tau$ by replacing

- every atomic subformula of the form $\Box q$ with $q \in X$ by 0,

- every atomic subformula of the form $\Box q$ with $q \in Q \setminus X$ by 1, and

- every atomic subformula of the form $\Diamond q$ with $q \in Y$ by 1.

Let $K$ be an arbitrary Kripke tree. A partial run of $A$ on $K$ is defined just
as an ordinary accepting run with the following modification of local consistency
as defined in (2). For every $v \in V$ such that $w_R(v) \in \mathrm{In}(K)$, it is required that

$$K, R, v \models \tau_v^{X_v, Y_v} \qquad (6)$$

holds where $\tau_v$ is as defined in (3) and

$$X_v = \mathrm{pvt}(K{\uparrow}w_R(v)) , \quad Y_v = \mathrm{scf}(K{\uparrow}w_R(v)) .$$

Note that in general neither $\tau$ implies $\tau^{X,Y}$ nor $\tau^{X,Y}$ implies $\tau$. So there is
no a priori relation between the existence of runs and partial runs. But using
Remark 1 and the right notion of restriction of a run one can prove the following.

Lemma 1. Let $K$ be an arbitrary Kripke tree. Assume $\mathcal{K}(A) = EL$ and
$K \in \mathcal{K}(A)$. Then there exists a partial run of $A$ on $K$.

Let $R$ be a partial run of $A$ on a Kripke tree $K$. The run $R$ is distributed if
for every $w \in W$ there exists at most one $v \in V$ with $w_R(v) = w$.

The set of all frontier pairs of $R$, denoted by $\mathrm{FrtPrs}(R)$, is defined by
$\mathrm{FrtPrs}(R) = \{\lambda(v) \mid v \in V \text{ and } w_R(v) \in \mathrm{Lvs}(K)\}$. Similarly, the set of all
frontier states of $R$, denoted $\mathrm{FrtSts}(R)$, is defined by
$\mathrm{FrtSts}(R) = \{q_R(v) \mid v \in V \text{ and } w_R(v) \in \mathrm{Lvs}(K)\}$.

The crucial lemma connecting Kripke trees with saturated inner worlds and 
partial runs is as follows. 






Lemma 2. Let $K$ be a Kripke tree and $R$ a distributed partial run of $A$ on $K$.
Assume that for every $q \in \mathrm{FrtSts}(R)$ there exists a Kripke tree $K_q \in \mathcal{K}_q$ such
that the tree $K^*$ defined by

$$K^* = K \odot \{(w, K_q) \mid q \in \mathrm{FrtSts}(R)\}$$

does not belong to $EL$.

Then there exists an accepting run of $A$ on the Kripke tree $K^{**}$ defined by

$$K^{**} = K^s \odot \{(w, K_q) \mid (w, q) \in \mathrm{FrtPrs}(R)\} ,$$

which does not belong to $EL$.

Note that because $R$ is supposed to be distributed, the trees $K^*$ and $K^{**}$
are obtained from $K$ and $K^s$, respectively, by adding to each leaf at most one
of the trees $K_q$.

The proof of this lemma is technically involved and makes extensive use of 
the aforementioned properties of preventable and always successful states. 

I will conclude this section with a rough sketch of the proof of Theorem 4. 

Sketch of the Proof of Theorem 4. Let $\{(u_i, u'_i)\}_{i<m}$ be a discriminating family
for $L$ of size $\iota(L)$ and $A$ an ATA with $\mathcal{K}(A) = EL$. I claim that for every $i < m$,
there exists a state $q$ such that $u'_i \in \mathcal{K}_q$, but $u'_j \notin \mathcal{K}_q$ for $j < m$ and $j \neq i$. This
clearly implies the claim of the theorem.

By way of contradiction, assume this is not the case. Then there exists $i < m$
such that for every $q \in Q$ with $u'_i \in \mathcal{K}_q$ there exists $j \neq i$ such that $u'_j \in \mathcal{K}_q$. For
every such $q$ let $j_q$ be an appropriate index $j$.

Let $K$ be a $|Q|$-branching Kripke tree² such that every maximal path starting
with the root is labeled $u_i a_i$ where $a_i$ is the first letter of $u'_i$. Consider the Kripke
tree $K'$ defined by $K' = K \odot \{(w, u'_i) \mid w \in \mathrm{Lvs}(K)\}$.

Clearly, $K' \in EL$ (because every maximal path through $K'$ is labeled $u_i u'_i$).
Thus, by Lemma 1, there exists a partial run of $A$ on $K'$. By restricting this run
to the worlds in $K$, we obtain a partial run $R$ of $A$ on $K$. This run has the obvious
property that for every $q \in \mathrm{FrtSts}(R)$ there exists an accepting run of $A_q$ on $u'_i$.
By manipulating this run adequately, using the fact that $K$ is $|Q|$-branching, one
can transform it into a distributed partial run with the same property. This run
together with the $u'_{j_q}$ replacing the $K_q$'s satisfies the assumptions of Lemma 2.
We can thus conclude that the Kripke tree $K^{**}$ as defined in Lemma 2, which does
not belong to $EL$, is accepted by $A$, a contradiction.

² A Kripke tree $K$ is $m$-branching if for every world $w \in W$ the following is true. For
every successor $w_0$ of $w$ there exist at least $m - 1$ other successors $w_1, \ldots, w_{m-1}$ of $w$
such that all subtrees $K{\downarrow}w_0, \ldots, K{\downarrow}w_{m-1}$ are isomorphic.

7 Connection with Büchi Automata and μ-Calculus

One can show that Theorem 2 also holds for the modal $\mu$-calculus (see, for
instance, [2]). So we also obtain: every modal $\mu$-calculus formula equivalent to
the CTL+ formula $\varphi_n$ has length at least $\binom{n}{\lceil n/2 \rceil}$. This is interesting because of
the following.

As the modal $\mu$-calculus is closed under syntactic negation, the above also
says that every modal $\mu$-calculus formula equivalent to the CTL+ formula

$$A(G\neg p_0 \vee \cdots \vee G\neg p_{n-1})$$

has length at least $\binom{n}{\lceil n/2 \rceil}$. And, clearly, this property can easily be expressed
by an alternation-free $\mu$-calculus (AFMC) formula (according to the definition
of alternation-freeness as introduced by Emerson and Lei in [7]), because it
can be expressed in CTL. On the other hand, the set of all $\omega$-words over $2^{P_n}$
satisfying the linear-time temporal property $G\neg p_0 \vee \cdots \vee G\neg p_{n-1}$ is recognized
by a nondeterministic Büchi word automaton (NBW) with $n + 1$ states. We
therefore have:

Corollary 1. There is an exponential lower bound for the translation NBW $\mapsto$
AFMC in the sense of [10].

This answers a question left open by Kupferman and Vardi in [10]. 

8 Conclusion 

We have seen that there is an exponential gap between the succinctness of CTL+
and CTL, as well as an exponential gap between nondeterministic Büchi word
automata and alternation-free $\mu$-calculus. Just as in many other situations, the
automata-theoretic approach to understanding the expressive power of
(specification) logics has proved to be useful.

References 

1. O. Bernholtz [Kupferman] and O. Grumberg. Branching temporal logic and
amorphous tree automata. In E. Best, ed., CONCUR ’93, vol. 715 of LNCS, 262-277.

2. O. Bernholtz [Kupferman], M. Y. Vardi, and P. Wolper. An automata-theoretic
approach to branching-time model checking. In D. L. Dill, ed., CAV ’94, vol. 818
of LNCS, 142-155.

3. E. M. Clarke, E. A. Emerson, and A. P. Sistla. Automatic verification of finite-
state concurrent systems using temporal logic specifications: A practical approach.
In PoPL ’83, 117-126.

4. E. A. Emerson and J. Y. Halpern. Decision procedures and expressiveness in the
temporal logic of branching time. In STOC ’82, 169-181.

5. E. A. Emerson and J. Y. Halpern. Decision procedures and expressiveness in the
temporal logic of branching time. J. Comput. System Sci., 30(1):1-24, 1985.

6. E. A. Emerson, C. S. Jutla, and A. P. Sistla. On model-checking for fragments of
μ-calculus. In C. Courcoubetis, ed., CAV ’93, vol. 697 of LNCS, 385-396.

7. E. A. Emerson and C.-L. Lei. Efficient model checking in fragments of the
propositional mu-calculus (extended abstract). In LICS ’86, 267-278.

8. K. Etessami, M. Y. Vardi, and Th. Wilke. First-order logic with two variables and
unary temporal logic. In LICS ’97, 228-235.

9. J. A. W. Kamp. Tense Logic and the Theory of Linear Order. PhD thesis,
University of California, Los Angeles, Calif., 1968.

10. O. Kupferman and M. Y. Vardi. Freedom, weakness, and determinism: From
linear-time to branching-time. In LICS ’98, 81-92.

11. L. J. Stockmeyer. The Complexity of Decision Problems in Automata Theory and
Logic. PhD thesis, Dept. of Electrical Engineering, MIT, Boston, Mass., 1974.

12. Th. Wilke. CTL+ is exponentially more succinct than CTL. Technical Report
99-7, RWTH Aachen, Fachgruppe Informatik, 1999. Available online via
ftp://ftp.informatik.rwth-aachen.de/pub/reports/1999/index.html.



A Top-Down Look at a Secure Message 



Martín Abadi¹, Cédric Fournet², and Georges Gonthier³*



¹ Bell Labs Research, Lucent Technologies
² Microsoft Research
³ INRIA Rocquencourt



Abstract. In ongoing work, we are investigating the design of secure 
distributed implementations of high-level process calculi (in particular, 
of the join-calculus). We formulate implementations as translations to 
lower-level languages with cryptographic primitives. Cryptographic pro- 
tocols are essential components of those translations. In this paper we 
discuss basic cryptographic protocols for transmitting a single datum 
from one site to another. We explain some sufficient correctness condi- 
tions for these protocols. As an example, we present a simple protocol 
and a proof of its correctness. 



1 Introduction 

In the last few years, the scope of security protocols has grown, and so has their 
complexity. In addition to basic functions such as authentication and key es- 
tablishment, recent protocols sometimes support elaborate transactions. They 
may comprise preliminary negotiations, where the parties discuss their prefer- 
ences and expectations, and layers for application records and for error messages 
(e.g., [15,17]). Correspondingly, research on the analysis of security protocols has
started to address the challenges of those sophisticated protocols. Examples of 
this line of work include the recent analyses of the SSL, TLS, IKE, and SET 
protocols (e.g., [32,29,28,23,22,8]). 

These trends notwithstanding, in this paper we consider only basic protocols 
with an elementary goal. This goal is to transmit a single datum from one site 
to another. The protocols consist of one or more lower-level messages. They 
employ encryption in order to guarantee the integrity and secrecy of the datum, 
and nonces or other tags in order to protect against replay attacks. 

Despite their simplicity, these protocols serve as building blocks for complex 
systems. Relying on these protocols, we can add cryptographic protection to an 
arbitrary program, much as is done in systems with remote invocation facili- 
ties [7]. More precisely, we can translate from a process calculus with primitive 
secure channels (the join-calculus [11]) to a lower-level process calculus where 
communication across sites may take place on public channels and may use cryp- 
tography for security. We have studied such translations in recent papers [3,4]. 
The main purpose of this paper is to show a close-up of an essential part of those 
translations. 

* Partly supported by ESPRIT CONFER-2 WG-21836. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 122-141, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999 






In addition, these protocols provide an example of a helpful top-down ap- 
proach to the specification and verification of security protocols. Following this 
approach, we reduce the problem of implementing the join-calculus to that of 
writing protocols for transmitting a single datum. We isolate a handful of crisp 
correctness conditions on the protocols. These conditions are sufficient for the 
overall correctness of the implementation of the join-calculus. They are not 
strictly necessary, but each of them corresponds to a sensible requirement on 
protocols, they are hard to weaken, and they can all be met. We also consider 
the application of these conditions to specific protocols. In particular, as a di- 
dactic example, we present a new, simple protocol and a proof of its correctness. 

The next section discusses some of the problems that the top-down approach 
is intended to address. Section 3 presents the join-calculus and some extensions. 
Section 4 describes our correctness conditions for protocols. The correctness 
conditions appear in [3] ; the aim of this section is to review them and to explain 
their implications. Section 5 shows our new protocol; section 6 contains the 
corresponding proofs. Section 7 concludes. Thus, this paper is partly a review; 
its main novelties are in informal explanations and in sections 5 and 6. 

2 A Top-Down Approach 

Work on protocols is seldom purely top-down. Protocol designers often proceed 
bottom-up, building systems from cryptographic algorithms and network ser- 
vices. Indeed, some protocols arise as applications of cryptographic primitives. 
For example, Ylönen designed version 1 of the popular protocol SSH [33] as
an exercise in the use of RSA; the design of version 2 was “more or less top- 
down” [34]. Therefore, a top-down approach, on its own, is probably unrealistic. 

With this caveat, a top-down approach can serve as a guide. In particular, it 
helps in addressing common confusions about security protocols and their goals 
(e.g., [16]). These confusions often enable “attacks” (scenarios that violate some 
of the expected security properties of a protocol). Although some attacks reveal 
serious flaws in protocols, many alleged attacks are merely the annoying result 
of poor protocol specifications, or of poor understandings of those specifications. 
A top-down perspective helps in distinguishing the dangerous attacks from the 
unimportant ones, and in avoiding the former. 

Those confusions arise even in the analysis of elementary protocols for key 
exchange. For example, suppose that, after running a certain protocol, two par- 
ties A and B are supposed to share a session key. Suppose further that an attacker 
can arrange that A and B end up with two different keys, but does not know 
those keys. Surely, this scenario violates the key-establishment goal of the proto- 
col. On the other hand, the scenario may be harmless. When A and B attempt 
to communicate after running the protocol, they are most likely to discover the 
discrepancy if their subsequent encrypted messages contain some checkable re- 
dundancy. Then A and B are likely to rerun the protocol, and eventually they 
may agree on a key. At worst, the attack will result in a loss of liveness, but not 
a loss of safety. (Lowe has considered other attacks with similar properties [19].) 






Those confusions become more delicate for current, complex protocols. For 
example, suppose that A and B start out by discussing which cryptosystem 
to adopt, using some cleartext messages. Suppose further that an attacker can 
tamper with this negotiation, convincing A and B to use a cryptosystem of its 
choice. If A and B adjust the contents of the subsequent conversation to the 
outcome of the negotiation, then this attack may entail only a loss of liveness: A 
and B may not send certain sensitive messages. However, this attack could also 
cause the inappropriate use of a poor cryptosystem, and failures of authenticity 
and secrecy (e.g., [32]). 

Characteristically, these problematic scenarios are vague on the context in 
which a protocol operates, and on the use of the protocol — for example, how 
retries happen, or what sorts of application data are sent after a key of a certain 
type is established. The difficulties diminish or disappear if the protocol is seen 
as part of a larger system. An attack on the protocol is significant only if it has 
an effect on the behavior of the larger system. 

For our purposes, the larger system is an arbitrary join-calculus program (or 
a lower-level translation of this program). Top-down, we go from the semantics 
of the join-calculus to the design of particular protocols; if a protocol is correct 
according to our conditions then it can serve as a building block for the im- 
plementation of the join-calculus. Every potential attack on a protocol can be 
assessed against the conditions, unambiguously. 

In a more restricted view, the larger system could be one that serves a spe- 
cific purpose, for example a file system, rather than an arbitrary join-calculus 
program. This specialization may allow more efficient protocols; it also gives rise 
to the danger that the protocols will be used in unintended, inappropriate ways. 

A top-down perspective is not unique to our work. In particular, recent papers 
by Bellare, Canetti, and Krawczyk [6] and Lynch [21] concern modular methods 
for designing and analyzing protocols. Although those works are largely disjoint 
from ours in techniques and results, they seem to have a similar conceptual basis
and potential benefits. A salient characteristic of our work is that it treats cryp- 
tographic operations as black boxes (cf. [6,18]). A further refinement would be 
to replace the black boxes with particular cryptographic algorithms. While this 
refinement is a natural continuation of our approach, it may be quite hard, and 
we have yet to study it. 

3 The Join-Calculus and Its Extensions 

Next we describe the join-calculus and some additions for representing cryp- 
tographic operations and a public network. This review is brief; we refer the 
reader to previous papers for details on the join-calculus, its theory, and its 
applications [11,12,14,13,10,3,4]. 

3.1 The Join-Calculus 

In the join-calculus, processes communicate through named, one-way channels. 
Intuitively, the channels of the join-calculus have built-in security properties: 






- As in the pi-calculus [26,27], the name of a channel is a transferable but
unforgeable capability.

- A process that sends a message on a channel must have the name of the
channel.

- Only the process that creates a channel can receive messages on the channel.



We use lowercase identifiers $x$, $y$, foo, bar, ... for names. In addition to a
category of names, the syntax of the join-calculus includes categories of
values, processes, definitions, and join-patterns. These are defined in the following
grammar, where we write $\tilde{v}$ for a tuple of values $v_1, v_2, \ldots, v_n$.



$v$ ::=                                     values
       $x$                                  name
$P$ ::=                                     processes
       $x\langle\tilde{v}\rangle$           message
    |  def $D$ in $P$                       local definition
    |  if $v = v'$ then $P$ else $P'$       comparison
    |  $P \mid P'$                          parallel composition
    |  0                                    null process
$D$ ::=                                     definitions
       $J \triangleright P$                 reaction rule
    |  $D \wedge D'$                        conjunction of definitions
$J$ ::=                                     join-patterns
       $x\langle\tilde{y}\rangle$           message pattern
    |  $J \mid J'$                          join of patterns



Processes have the following informal semantics: 

- $x\langle\tilde{v}\rangle$ sends the tuple of values $\tilde{v}$ on $x$, asynchronously.

- def $D$ in $P$ is the process $P$ with the local definitions given in $D$. In def $D$ in
$P$, the channel names defined in $D$ are recursively bound in the whole of
def $D$ in $P$, with lexical scoping.

- if $v = v'$ then $P$ else $P'$ tests whether $v = v'$, then runs the process $P$ or the
process $P'$ depending on the result of the test.

- $P \mid P'$ is the parallel composition of the processes $P$ and $P'$.

- 0 is the null process, which does nothing.

A definition $J \triangleright P$ says that the process $P$ may run when there are messages
that match the join-pattern $J$. For example, let $D$ be the definition:

$$(x_1\langle y_1\rangle \mid x_2\langle y_2, y_3\rangle) \triangleright (x_1\langle y_1\rangle \mid z\langle y_1, y_2, y_3\rangle)$$

This definition introduces two channels, with names $x_1$ and $x_2$. The names $y_1$, $y_2$,
and $y_3$ are also bound; they are formal parameters that may be instantiated to
actual values received on $x_1$ and $x_2$. The name $z$ is free. When a message $x_1\langle v_1\rangle$
appears on $x_1$ and a message $x_2\langle v_2, v_3\rangle$ appears on $x_2$, this definition may fire,
consuming both messages, reproducing $x_1\langle v_1\rangle$, and producing $z\langle v_1, v_2, v_3\rangle$. Thus,
def $D$ in $x_1\langle v_1\rangle \mid x_2\langle v_2, v_3\rangle$ yields def $D$ in $x_1\langle v_1\rangle \mid z\langle v_1, v_2, v_3\rangle$.
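The firing of this particular definition can be mimicked by rewriting a multiset of pending messages. The sketch below is an illustration only: it hard-codes the example definition $D$, models a message as a (channel, tuple) pair, and resolves the nondeterministic choice among matching messages by taking the first match; the function name is hypothetical.

```python
from collections import Counter


def fire_once(messages):
    """Fire D = (x1<y1> | x2<y2,y3>) |> (x1<y1> | z<y1,y2,y3>) once.

    Returns the new multiset of pending messages, or None if the
    join-pattern has no match.
    """
    pending = Counter(messages)
    x1_msgs = [m for m in pending if m[0] == "x1" and len(m[1]) == 1]
    x2_msgs = [m for m in pending if m[0] == "x2" and len(m[1]) == 2]
    if not x1_msgs or not x2_msgs:
        return None
    m1, m2 = x1_msgs[0], x2_msgs[0]
    # Consume one message on each channel named in the join-pattern.
    pending[m1] -= 1
    pending[m2] -= 1
    (y1,), (y2, y3) = m1[1], m2[1]
    # Emit the right-hand side with the formal parameters instantiated.
    pending[("x1", (y1,))] += 1
    pending[("z", (y1, y2, y3))] += 1
    return +pending  # unary plus drops entries whose count fell to zero


if __name__ == "__main__":
    print(fire_once([("x1", ("v1",)), ("x2", ("v2", "v3"))]))
```

Because the message on $x_1$ is reproduced, the definition can fire again as soon as another message arrives on $x_2$.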






Martin Abadi et al. 



3.2 Syntactic Extensions 

It is convenient to introduce some syntactic extensions that do not affect the
expressiveness of the calculus. We write repl $P$ for the replication of $P$,
which can be defined as def $x\langle\rangle \triangleright (P \mid x\langle\rangle)$
in $x\langle\rangle$ for a fresh $x$. In addition, we have notations for data
structures that contain unique identifiers. We write the declaration uids $t$
as the definition of an initially empty set $t$; and write if not tset $t(c)$
then $P$ for a process that atomically tests whether $c$ is in $t$, and if not
adds $c$ to $t$ and then triggers the execution of $P$. (The test-and-set must be
atomic, but the execution of $P$ need not be.)
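As an informal illustration of the table of unique identifiers, the following
Python sketch models uids $t$ as a set guarded by a lock, so that the
test-and-set is atomic while the continuation runs outside the critical
section. The class name UidTable and the helper if_not_tset are hypothetical
names chosen for this sketch; the calculus itself does not prescribe any
implementation.

```python
import threading


class UidTable:
    """An initially empty set t with an atomic test-and-set (illustrative)."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def test_and_set(self, c):
        """Return True iff c was absent; in that case c is now recorded."""
        with self._lock:
            if c in self._seen:
                return False
            self._seen.add(c)
            return True


def if_not_tset(table, c, process):
    """Model of 'if not tset t(c) then P': run P only on the first sight of c."""
    if table.test_and_set(c):
        process()  # P runs after the lock is released; only the test is atomic
```

Calling if_not_tset twice with the same identifier triggers the continuation
only once, which is the behavior relied upon later for discarding duplicates.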

3.3 Cryptographic Primitives 

We define the sjoin-calculus, which is analogous to the spi-calculus [5], by
enriching the join-calculus with a few constructs:

— fresh $x$ is a definition that introduces the fresh name $x$; this name may for
example be included as a unique identifier within a message.

— keys $x^+, x^-$ is a definition that introduces a pair of keys, for public-key
encryption [24]; $x^+$ is an encryption key (the “public key”) and $x^-$ is the
inverse decryption key (the “secret key”).

— $\{\tilde v\}_{v'}$ is a value that represents the result of encrypting
$\tilde v$ with key $v'$. The inverse of $v'$ should be used for decryption.

— decrypt $v$ using $v'$ to $\tilde x$ in $P$ else $P'$ is a process that attempts
to decrypt $v$ using $v'$ as key. If the decryption succeeds, then $P$ runs, with
the results of the decryption substituted for $\tilde x$; otherwise, $P'$ runs.
We may omit $P'$ when it is 0.
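These constructs treat cryptography symbolically: a ciphertext is an opaque
value that can be opened only with the inverse key. The following Python
sketch illustrates this reading. All names (fresh, Key, Ciphertext, keys,
encrypt, decrypt) are hypothetical, and the choice to make decryption fail
when the number of components does not match the expected arity is an
assumption of the sketch.

```python
import itertools
from dataclasses import dataclass

_counter = itertools.count()


def fresh(label="n"):
    """Model of 'fresh x': return a name distinct from all earlier ones."""
    return f"{label}#{next(_counter)}"


@dataclass(frozen=True)
class Key:
    name: str
    public: bool  # True for x+, False for x-


@dataclass(frozen=True)
class Ciphertext:
    payload: tuple
    key: Key


def keys(label="x"):
    """Model of 'keys x+, x-': return a fresh (encryption, decryption) pair."""
    name = fresh(label)
    return Key(name, True), Key(name, False)


def encrypt(values, key):
    """Model of {v~}_{v'}: a symbolic ciphertext, not real encryption."""
    return Ciphertext(tuple(values), key)


def decrypt(value, key, arity):
    """Model of 'decrypt v using v' to x~ in P else P''.

    Returns the payload on success, or None to select the else branch.
    Success requires the inverse of the encryption key (assumption: an
    arity mismatch also counts as failure).
    """
    if (isinstance(value, Ciphertext) and isinstance(key, Key)
            and value.key.name == key.name
            and value.key.public != key.public
            and len(value.payload) == arity):
        return value.payload
    return None
```

In this model a value encrypted under $x^+$ is recovered with $x^-$ and with
no other key, which is the only property of encryption used in the sequel.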

With these additions, we can represent systems where low-level processes use 
cryptography for security. 

3.4 A Public Network 

In addition, we need a model of a public, asynchronous network over which 
low-level processes communicate. This network should allow an attacker to in- 
tercept, modify, duplicate, and inject messages. Sometimes we may also wish to 
assume that the attacker has access to certain public keys without knowing the 
corresponding secret keys. Therefore, we define the contexts: 

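$$
\begin{array}{rcl}
\mathcal{E}[\,\cdot\,] & = & \mathsf{def}\ recv\langle\kappa\rangle \mid emit\langle m\rangle \triangleright \kappa\langle m\rangle\ \mathsf{in}\ [\,\cdot\,] \\
\mathcal{E}nv[\,\cdot\,] & = & \mathcal{E}[\,plug\langle emit, recv\rangle \mid W \mid \cdot\,] \\
\mathcal{E}nv_{x^\pm}[\,\cdot\,] & = & \mathsf{def}\ \mathsf{keys}\ x^+, x^-\ \mathsf{in}\ \mathcal{E}nv[\,plug'\langle x^+\rangle \mid \cdot\,]
\end{array}
$$

where:

— emit and recv represent the network interface. For output, a process sends
its message on emit. For input, it sends a continuation channel $\kappa$ on recv,
and the network may return a message on $\kappa$.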



A Top-Down Look at a Secure Message 






— plug and plug' are auxiliary channels whose sole purpose is to make certain
public names (emit, recv, $x^+$) available to outside processes.

— $W$ is the process repl (def fresh $m$ in repl $emit\langle m\rangle$), which
repeatedly puts fresh messages on the network, as background noise, helping
protect against traffic-analysis attacks.

Intuitively, $\mathcal{E}nv[\,\cdot\,]$ represents the network; $\mathcal{E}nv[P]$
describes a situation where both a process $P$ and any process running in
parallel may use emit and recv. The other contexts play auxiliary roles.
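The network interface can be pictured as an unordered pool of messages from
which each input request removes one message chosen arbitrarily. The Python
sketch below is a simplified model under our own assumptions: the class name
Network is hypothetical, nondeterminism is simulated with a seeded
pseudo-random choice, and a recv request that finds no pending message simply
returns False instead of waiting as it would in the calculus.

```python
import random


class Network:
    """Illustrative model of the rule recv<k> | emit<m> |> k<m>."""

    def __init__(self, seed=0):
        self._pending = []               # multiset of emitted messages
        self._rng = random.Random(seed)  # stands in for nondeterminism

    def emit(self, m):
        """Output: anyone, including an attacker, may add a message."""
        self._pending.append(m)

    def recv(self, k):
        """Input: pass one arbitrary pending message to continuation k.

        The message is consumed. Returns False when nothing is pending.
        """
        if not self._pending:
            return False
        i = self._rng.randrange(len(self._pending))
        k(self._pending.pop(i))
        return True
```

Because every participant shares the same emit and recv, an attacker in this
model can intercept a message by receiving it, duplicate it by emitting it
twice, or inject messages of its own.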

3.5 Operational Semantics and Types (Notation) 

The operational semantics of the join-calculus and its extensions define the
following notations and concepts [2].

— $P \downarrow_x$ holds if $P$ may output on $x$ immediately. For example,
$x\langle\tilde v\rangle \downarrow_x$.

— $P \Downarrow_x$ holds if $P$ may output on $x$ either immediately or after some
internal reductions. For example,
(def $y\langle\rangle \triangleright x\langle\rangle$ in $y\langle\rangle$) $\Downarrow_x$.

— $P \equiv Q$ holds if $P$ and $Q$ are structurally equivalent, that is, if $P$
and $Q$ differ only by certain simple syntactic rearrangements. For example, for
all $P_1$ and $P_2$, $P_1 \mid P_2 \equiv P_2 \mid P_1$. We also take
repl $P \equiv (P \mid$ repl $P)$.

— $P \to Q$ holds if $P$ may reduce to $Q$, that is, if $P$ may perform one step
of internal computation and then behave as $Q$. For example,

$$\mathsf{def}\ y\langle\rangle \triangleright x\langle\rangle\ \mathsf{in}\ y\langle\rangle \;\to\; \mathsf{def}\ y\langle\rangle \triangleright x\langle\rangle\ \mathsf{in}\ x\langle\rangle$$

The relations $\to^{=}$ and $\to^{*}$ are the reflexive closure and the
reflexive-transitive closure of $\to$, respectively.

— An evaluation context $C[\,\cdot\,]$ is a context in which computation may
immediately take place: if $P \to Q$ then $C[P] \to C[Q]$. For example,
$P_1 \mid \cdot$ is an evaluation context, while decrypt $v$ using $v'$ to
$\tilde x$ in $\cdot$ else $P_2$ is not.

We rely on simple, monomorphic type systems for the calculi. In these type
systems, each channel has an associated arity (an integer size for the tuples of
values transmitted on the channel). We write $\langle\tau_1, \ldots, \tau_n\rangle$
for the type of channels that carry tuples with $n$ values of respective types
$\tau_1, \ldots, \tau_n$. We allow types to be recursively defined (formally,
using a fixpoint operator), so we may have for example
$\tau = \langle\tau, \tau\rangle$.

In addition to channel types, we have a basic type BitString. This is the type
of keys, ciphertexts and their contents, and of the fresh names introduced with
the construct fresh $\cdot$.

We assume that each name is associated with a type (although we usually
keep this type implicit), and that there are infinitely many names for each type.
Throughout, we consider only well-typed processes.

4 Correctness 

In this section we arrive, more or less top-down, at the problem of transmitting 
a datum securely. We present correctness conditions for protocols that solve the 
problem. 






4.1 Goals 

Now we can write both pure join-calculus processes, where security is based on
the scoping of channel names, and processes that communicate using cryptography
over a public network (as represented by $\mathcal{E}nv[\,\cdot\,]$). Our
objective is to see the latter as implementations of the former. Moreover, we
wish to do this systematically: we would like to have compilers that map pure
join-calculus processes to lower-level code that can execute securely over a
public network.

In order to formalize the security requirement for such compilers, we resort
to process equivalences. We say that two processes $P_1$ and $P_2$ are
equivalent, and write $P_1 \approx P_2$, when no context can distinguish one from
the other [25,9]. Intuitively, we may view the context as an attacker; then
$P_1 \approx P_2$ entails an integrity property (limiting the effect of the
attacker on the behavior of $P_1$ and $P_2$) and a secrecy property (limiting the
observations of the attacker). Formally, we define $\approx$ as the largest
symmetric relation $\mathcal{R}$ on processes such that:

1. if $P \mathrel{\mathcal{R}} Q$ and $P \downarrow_x$ then $Q \Downarrow_x$,

2. $\mathcal{R}$ is a congruence for all evaluation contexts, that is, for all
evaluation contexts $C[\,\cdot\,]$, if $P \mathrel{\mathcal{R}} Q$ then
$C[P] \mathrel{\mathcal{R}} C[Q]$,

3. $\mathcal{R}$ is a weak bisimulation, that is, if $P \mathrel{\mathcal{R}} Q$
and $P \to^{*} P'$ then, for some $Q'$, $P' \mathrel{\mathcal{R}} Q'$ and
$Q \to^{*} Q'$.

When we devise implementations of the join-calculus, we wish to preserve
equivalences, since equivalences can express security properties [5,1]. More
precisely, if $P_1$ and $P_2$ are equivalent join-calculus processes, we wish to
compile them to processes $P_1'$ and $P_2'$ such that $\mathcal{E}nv[P_1']$ and
$\mathcal{E}nv[P_2']$ are equivalent. The mention of $\mathcal{E}nv[\,\cdot\,]$
accounts for the use of the public network by $P_1'$ and $P_2'$.

4.2 Protocols 

A crucial part of implementing the join-calculus is mapping join-calculus
messages to cryptographic protocols. Following an obvious strategy and postponing
optimizations, we associate a pair of keys $x^+, x^-$ with each join-calculus
channel $x$.

— The encryption key $x^+$ corresponds to the capability of sending messages
on $x$, which may be transferred.

— The decryption key $x^-$ corresponds to the capability of receiving messages
on $x$, which only the creator of $x$ has.

In order to simulate communication on a join-calculus channel $x$, we employ
protocols that consist of two processes $E_x[\tilde v]$ and $R_x$, for sending
and receiving, respectively.

— Using the key $x^+$ for encryption, $E_x[\tilde v]$ sends $\tilde v$. Within
$\tilde v$, an encryption key $y^+$ (rather than a pair $y^+, y^-$) represents
the corresponding channel $y$.

— Using the key $x^-$ for decryption, $R_x$ receives messages, then forwards their
cleartext contents on an auxiliary, internal channel $x^\circ$.






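For example, a protocol might be:

$$
\begin{array}{rcl}
E_x[\tilde v] & = & emit\langle \{\tilde v\}_{x^+} \rangle \\
R_x & = & \mathsf{def}\ \kappa\langle m\rangle \triangleright \mathsf{decrypt}\ m\ \mathsf{using}\ x^-\ \mathsf{to}\ \tilde y\ \mathsf{in}\ x^\circ\langle\tilde y\rangle \\
    &   & \mathsf{in}\ recv\langle\kappa\rangle
\end{array}
$$

where the length of $\tilde y$ is deduced from the type of $x$. For instance, using
this protocol, the join-calculus process $x\langle y\rangle$ may be mapped to
$emit\langle\{y^+\}_{x^+}\rangle$. This naive protocol is subject to message
replays and other obvious attacks. The protocol of section 5 thwarts those
attacks.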

In general, such protocols should guarantee the integrity and secrecy of v. 
These properties are informally appealing, and they are formally necessary for 
the desired preservation of equivalences. Section 4.3 gives a more precise and 
complete list of properties. 

4.3 Correctness Conditions for Protocols 

In the following definition, a protocol is a pair $(R_x, E_x[\,\cdot\,])$
consisting of a process for receiving and one for sending, parameterized by a
channel name $x$. The definition relies on a set $\mathfrak{R}$ of derivatives of
the receiving process $R_x$. This set represents the different states of a
receiver after interaction with its context.

The definition also relies on an expansion relation on processes ($\succeq$) [31],
which is similar to $\approx$ but stronger and asymmetric. We define $\succeq$ as
the largest relation $\mathcal{R}$ such that $\mathcal{R}$ and its converse meet
requirements (1), (2), and (3) of the definition of $\approx$, and such that if
$P \mathrel{\mathcal{R}} Q$ and $P \to P'$ then, for some $Q'$,
$P' \mathrel{\mathcal{R}} Q'$ and $Q \to^{=} Q'$.

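Definition 1. The protocol $(R_x, E_x[\,\cdot\,])$ is correct if there is a set of
processes $\mathfrak{R}$ that satisfies the following conditions.

1. $R_x \in \mathfrak{R}$.

2. The free names of $E_x[\,\cdot\,]$ are at most emit, recv, and names of type
BitString. For every $R \in \mathfrak{R}$, the free names of $R$ are at most emit,
recv, $x^\circ$, and names of type BitString.

3. For every $R \in \mathfrak{R}$, it is not the case that $R \downarrow_{emit}$
or that $R \downarrow_{x^\circ}$.

4. For every $R \in \mathfrak{R}$, for every tuple $\tilde v$ of values of type
BitString whose length matches the arity of $x$, if $x^-$ does not occur in
$\tilde v$, then

$$\mathcal{E}nv_{x^\pm}[\,R \mid E_x[\tilde v]\,] \;\succeq\; \mathcal{E}nv_{x^\pm}[\,R \mid x^\circ\langle\tilde v\rangle\,]$$

5. For every value $v$, if $x^+$ and $x^-$ do not occur in $v$ and

$$\mathcal{E}nv_{x^\pm}[\,R_x \mid emit\langle v\rangle\,] \;\to\; P$$

then

$$P \;\succeq\; \mathcal{E}nv_{x^\pm}[\,R_x \mid emit\langle v\rangle\,]$$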






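6. For every $R \in \mathfrak{R}$, if $x^-$ does not occur in $v$ and

$$\mathcal{E}nv_{x^\pm}[\,R \mid emit\langle v\rangle\,] \;\to\; P$$

then

$$P \;\succeq\; \mathcal{E}nv_{x^\pm}[\,R' \mid Q\,]$$

for some $R' \in \mathfrak{R}$ and some process $Q$ such that $x^-$ does not occur
in $Q$.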

Condition 1 says that the initial receiving process, $R_x$, is in $\mathfrak{R}$.

Condition 2 restricts the free names in use in the protocol. The sending
process $E_x[\,\cdot\,]$ may have access to the network-interface channels (emit
and recv). The receiving process $R_x$ and all other processes in $\mathfrak{R}$
may have access to those channels and also to $x^\circ$. The requirement that
$x^\circ$ does not occur free in $E_x[\,\cdot\,]$ rules out degenerate protocols
where the sending process does the work of the receiving process, like the
protocol $(0, x^\circ\langle\cdot\rangle)$. In addition to emit, recv, and
$x^\circ$, the protocol may rely on names of type BitString. Intuitively,
condition 2 implies that communication from $E_x[\,\cdot\,]$ to $R_x$ is limited
to messages of types BitString and $\langle$BitString$\rangle$ on the channels
emit and recv. Therefore, the protocol can be directly implemented over an
ordinary network like that represented by the channels emit and recv, without
any additional assumptions about physical security or out-of-band communication.

Condition 3 says that every process $R \in \mathfrak{R}$ is passive, in the sense
that it does not send messages on emit or $x^\circ$ spontaneously. This condition
still allows $R$ to send messages on recv.

Condition 4 says that the protocol transmits messages reliably and secretly
when an instance of the sending process $E_x[\,\cdot\,]$ is put in parallel with
the receiving process $R_x$ or any other process $R \in \mathfrak{R}$. Using the
expansion relation, this condition compares $R \mid E_x[\tilde v]$ with
$R \mid x^\circ\langle\tilde v\rangle$. The former process has $E_x[\tilde v]$ as
a component, while in the latter $E_x[\tilde v]$ is replaced with its intended
outcome, namely $x^\circ\langle\tilde v\rangle$ (with no other visible effect). In
both processes we have the same component $R$. Thus, this condition implies that
the state of the receiving process remains essentially unchanged as long as it
interacts with regular sending processes; any state change in the receiving
process, such as the addition of entries in internal tables, should not be
observable. (Condition 4 is slightly stronger in [3], where the assumption that
$x^-$ does not occur in $\tilde v$ is missing.)

Condition 4 rules out insecure protocols that leak information in the course
of transmitting a message; many obviously insecure protocols fall into this
category. In particular, the naive protocol of section 4.2 violates condition 4
on several counts. For example, a context that intercepts the message from
$E_x[\tilde v]$ and listens on $x^\circ$ can differentiate
$\mathcal{E}nv_{x^\pm}[R \mid E_x[\tilde v]]$ and
$\mathcal{E}nv_{x^\pm}[R \mid x^\circ\langle\tilde v\rangle]$. Furthermore, a
context that interacts with $\mathcal{E}nv_{x^\pm}[R \mid E_x[\tilde v]]$ may be
able to guess $\tilde v$, then confirm the guess by computing
$\{\tilde v\}_{x^+}$ and comparing $\{\tilde v\}_{x^+}$ with the message from
$E_x[\tilde v]$. In contrast, the context cannot obtain the same information in
interaction with $\mathcal{E}nv_{x^\pm}[R \mid x^\circ\langle\tilde v\rangle]$.

The last two conditions (5 and 6) describe interactions between the receiving
process and the context. These conditions are needed in addition to condition 4
because the context might not behave as a regular sending process. Since the
context may communicate with the receiving process only through messages of
the form $emit\langle v\rangle$, the last two conditions describe the behavior of
the receiving process in reaction to such a message. The conditions concern two
cases, distinguished by whether the encryption key $x^+$ occurs in $v$.

Condition 5 describes the behavior of $R_x$ in a context that does not have
access to the keys $x^+$ and $x^-$. Intuitively, this behavior is that exhibited
when the attacker has not yet been given $x^+$, so the receiving process should
remain essentially invisible. For every reduction
$\mathcal{E}nv_{x^\pm}[R_x \mid emit\langle v\rangle] \to P$, expansion relates
the outcome $P$ to the initial state
$\mathcal{E}nv_{x^\pm}[R_x \mid emit\langle v\rangle]$. Thus, if $R_x$ takes a
message from the network and the message is unrelated to $x^+$, $x^-$, then $R_x$
must resend the message. Similarly, if $R_x$ becomes a process $R$ by internal
reductions (so $P$ equals $\mathcal{E}nv_{x^\pm}[R \mid emit\langle v\rangle]$),
then it must be possible to go back from $R$ to $R_x$, obtaining that
$\mathcal{E}nv_{x^\pm}[R \mid emit\langle v\rangle] \succeq \mathcal{E}nv_{x^\pm}[R_x \mid emit\langle v\rangle]$.

For example, condition 5 excludes a protocol where $R_x$ “swallows” all
messages that are not encrypted under $x^+$, thus revealing its presence. It also
excludes a protocol where $R_x$ “swallows” messages selectively, possibly
revealing sensitive information.

Condition 6 describes the behavior of a receiving process $R \in \mathfrak{R}$ in
a context that has access to the encryption key $x^+$ but not to the decryption
key $x^-$. Intuitively, this behavior corresponds to the case where the attacker
has been given the encryption key, and can thus cause messages on $x^\circ$; in
this situation, the attacker should still not interfere with messages from other
senders. The reduction $\mathcal{E}nv_{x^\pm}[R \mid emit\langle v\rangle] \to P$
may change the state of the receiving process, for example by completing a run of
the protocol and relaying a message on $x^\circ$. The process $P$ cannot however
be arbitrary: it must be in the expansion relation with a process that includes a
new receiving process $R' \in \mathfrak{R}$ in place of $R$, in parallel with a
process $Q$ in place of $emit\langle v\rangle$. The process $Q$ may contain
emissions on $x^\circ$ and emit; it typically consists of parts of residues of $R$
that do not need the decryption key $x^-$ any more.

For example, condition 6 excludes an insecure variant of a correct protocol
where the receiving process is modified as follows. In answer to messages of a
form that a regular sending process never creates (for example,
$\{c, c, c\}_{x^+}$ for the protocol of section 5), the receiving process emits a
fresh value $u$; the receiving process works correctly for input tuples that do
not contain $u$, but leaks input tuples that contain $u$. A flaw appears only
after the creation of $u$, which regular sending processes never cause. Thus, all
other conditions can be met. Nonetheless, an attacker may obtain $u$, pass $u$ to
some third party, and harvest secrets later on if the third party includes $u$ in
messages to the receiving process. This variant violates condition 6: consider
the reduction
$\mathcal{E}nv_{x^\pm}[R \mid emit\langle\{v\}_{x^+}\rangle] \to P$ that will
cause the creation of $u$; then condition 6 requires that
$P \succeq \mathcal{E}nv_{x^\pm}[R' \mid Q]$ for some $R' \in \mathfrak{R}$, while
no such $R'$ can satisfy all the other conditions.

As mentioned above, these correctness conditions suffice for our results about
the implementations of the join-calculus. Compared to common specifications of
cryptographic protocols (e.g., [20]), our correctness conditions are rather
extensional [30]: they concern the intended effects of a protocol rather than
its internal behavior.

5 A Protocol 

In this section, we describe a specific protocol for transmitting a single datum. 
This protocol is an instructive example: it is simple and correct; moreover, its 
correctness is not too hard to prove. On the other hand, this protocol is not very 
practical. Two other correct protocols appear in [3]; they are somewhat more 
complex and realistic. In current work (discussed briefly in section 7), we are 
making further efficiency improvements. 

Our simple protocol is an enhancement of the naive protocol in several re- 
spects: 

— Each encrypted message contains a fresh component c, which serves as a 
confounder (making the message unpredictable and different from other mes- 
sages with the same payload) and as a unique identifier (against replay at- 
tacks). 

— Each encrypted message is repeated indefinitely, in case some copies are 
intercepted. 

— When a process receives a message for the first time, it reemits the mes- 
sage and records it before further processing. Later, it reemits but ignores 
duplicates of this message. 

The first correct protocol of [3] is similar in these respects, but it does fewer 
reemissions and uses records of unique identifiers instead of records of complete 
messages. 

Since calls to recv retrieve messages from the network non-deterministically, 
and since messages may be duplicated by the sender or by an attacker, processes 
have to filter for messages that are destined for them. This filtering relies first on 
a table for discarding duplicates, then on a decryption key for accessing message 
contents. We arrive at the following definitions. 



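$$
\begin{array}{rcl}
E_x[\tilde v] & = & \mathsf{def}\ \mathsf{fresh}\ c\ \mathsf{in}\ \mathsf{repl}\ emit\langle\{c, \tilde v\}_{x^+}\rangle \\
R_x & = & R_x\{\}[0] \\
R_x\{\tilde u\}[P] & = & \mathsf{def}\ \mathsf{uids}\ t_x\{\tilde u\}\ \mathsf{in} \\
 & & \mathsf{def}\ \kappa\langle m\rangle \triangleright emit\langle m\rangle \mid F_{m,x}\ \mathsf{in}\ P \mid \mathsf{repl}\ recv\langle\kappa\rangle \\
F_{m,x} & = & \mathsf{if\ not\ tset}\ t_x(m)\ \mathsf{then}\ F'_{m,x} \\
F'_{m,x} & = & \mathsf{decrypt}\ m\ \mathsf{using}\ x^-\ \mathsf{to}\ c, \tilde y\ \mathsf{in}\ x^\circ\langle\tilde y\rangle
\end{array}
$$

where the length of the tuple $\tilde y$ in the definition of $F'_{m,x}$ is deduced
from the type of $x$. In the receiving process, the component $F_{m,x}$ serves as
a filter for a single message; such components are replicated, and share a set of
previously received messages.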
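To make the three stages of the receiver concrete (reemission, duplicate
elimination, decryption), the following Python sketch walks a message through
them. It is a loose illustration under several assumptions of ours:
ciphertexts are modeled symbolically as tagged tuples, holding $x^-$ is modeled
by the receiver being the only code that looks inside such a tuple, the
sender emits one copy per call rather than replicating indefinitely, and the
names send, Receiver, and on_message are hypothetical.

```python
import itertools

_fresh = itertools.count()


def send(v, pub):
    """One emission of E_x[v~]: the value {c, v~}_{x+} with a fresh c."""
    c = f"c#{next(_fresh)}"  # confounder and unique identifier
    return ("enc", pub, (c,) + tuple(v))


class Receiver:
    """Illustrative model of R_x: reemit, filter duplicates, then decrypt."""

    def __init__(self, name, arity):
        self.pub = ("pub", name)   # stands for x+
        self.arity = arity         # arity of the channel x
        self.seen = set()          # the table t_x of received messages
        self.delivered = []        # tuples forwarded on x°

    def on_message(self, m, emit):
        # Rule kappa<m> |> emit<m> | F_{m,x}: always put m back on the network.
        emit(m)
        # F_{m,x}: test-and-set on the complete message.
        if m in self.seen:
            return
        self.seen.add(m)
        # F'_{m,x}: decryption succeeds only for {c, y~}_{x+} of the right length.
        if (isinstance(m, tuple) and len(m) == 3 and m[0] == "enc"
                and m[1] == self.pub and isinstance(m[2], tuple)
                and len(m[2]) == self.arity + 1):
            self.delivered.append(m[2][1:])  # drop c, forward y~ on x°
```

In this model, replaying a recorded message has no effect on the list of
delivered tuples, whereas two separate sends of the same payload are both
delivered because their confounders differ. Every message, well-formed or not,
is reemitted, so the receiver does not reveal which messages it accepted.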






Theorem 1. The protocol $(R_x, E_x[\,\cdot\,])$ is correct.

The proof of this theorem is the subject of the next section.

6 A Correctness Proof 

The proof of Theorem 1 relies on a series of lemmas that describes the various 
stages of communications processing (reception, duplicate elimination, decryp- 
tion). Proofs for more sophisticated protocols have similar structure, though 
each of the steps becomes harder. 

Throughout, we employ several techniques for establishing equivalences and 
other relations between processes. These techniques are not specific to cryptog- 
raphy and appear in lemmas in [2]; here, we only present their role informally. 
We also use the equivalence relation x, which is stronger than « and is obtained 
from the definition of « in section 4.1 by substituting — for ^*. The following 
“up-to” technique is helpful for proving x. In order to show that 72. C x, it 
suffices to prove, for all P and Q such that P TZ Q, that: 

1. if P ix then Q jJ-j,; conversely, if Q ix then P U-j,; 

2. C[P] x72x C[Q] for every context C[-] of the form def D \x\ R \ [•] such 
that the names bound in P and Q do not occur in D and R; 

3. if P P' , then Q — Q' and P' x72^x Q' for some Q'; conversely, if 
Q ^ Q' , then P — P' and P' x72^x Q' for some P' . 

(Here, 72^ is the reflexive closure of 72 and x72x is its composition with x on 
both sides.) 

We focus on a fixed channel $x$ and the associated names $x^+$, $x^-$, and
$x^\circ$. In the remainder of this section, we assume that these names and emit
and recv are never alpha-converted. The up-to proof techniques used in this
section allow this assumption.

We say that a value $m$ is a well-formed message when it is of the form
$\{c, \tilde v\}_{x^+}$ for some values $c, \tilde v$ of type BitString such that
the length of $\tilde v$ is the arity of $x$. A process $P$ is an internal state
when it is a parallel composition of processes $\kappa\langle m\rangle$,
$F_{m,x}$, or $F'_{m,x}$ for some BitString values $m$ in which $x^-$ does not
appear.

A net state is a process of the form:

$$\mathcal{E}[\,\mathsf{def}\ \mathsf{keys}\ x^+, x^- \wedge D\ \mathsf{in}\ R_x\{\tilde u\}[P] \mid Q\,]$$

where $D$ does not define emit, recv, $x^+$, or $x^-$, the tuple $\tilde u$
contains pairwise distinct BitString values, $P$ is an internal state, and $x^-$
does not appear in $D$, $Q$, or $\tilde u$. For any net state $S$, there exist a
unique context $C[\,\cdot\,]$, tuple $\tilde u$, process $Q$, and internal state
$P$ such that $S = C[R_x\{\tilde u\}[P] \mid Q]$. Moreover, $\tilde u$ and $P$ can
be derived (uniquely up to alpha-conversion and reordering for $\tilde u$ and
$\equiv$ for $P$) from any $T \equiv S$, because $R_x\{\tilde u\}[P]$ must contain
all occurrences of $x^-$ and $\kappa$ in $S$.

The first lemma says that net states are closed under application of evaluation 
contexts and under reduction. 






Lemma 1 (Closure). Let $S$ be a net state. For any evaluation context
$E[\,\cdot\,]$ in which names bound in $S$ do not appear, we have
$E[S] \equiv S'$ for some net state $S'$. For any reduction step $S \to T$, we
have $T \equiv S'$ for some net state $S'$.

Proof. Let $S = C[R_x\{\tilde u\}[P] \mid Q]$, with
$C[\,\cdot\,] = \mathcal{E}[\,\mathsf{def}\ \mathsf{keys}\ x^+, x^- \wedge D\ \mathsf{in}\ \cdot\,]$.
For the first claim, it is enough to consider contexts $E[\,\cdot\,]$ of the forms
$(\,\cdot \mid Q')$ and (def $D'$ in $\cdot\,$). We obtain $S'$ from $S$ by
replacing $Q$ with $Q \mid Q'$ or $D$ with $D \wedge D'$, respectively. For the
second claim, we base our analysis on the location of the processes involved in
the reduction step, and further on the type of reduction if the reduction is
local to $R_x\{\tilde u\}[P]$:

— If the step involves only processes in $Q$ (possibly with definitions in
$C[\,\cdot\,]$), then it leaves $R_x\{\tilde u\}[P]$ unchanged, so there must be
$D'$, $Q'$, and $C'[\,\cdot\,]$ such that
$C'[\,\cdot\,] = \mathcal{E}[\,\mathsf{def}\ \mathsf{keys}\ x^+, x^- \wedge D'\ \mathsf{in}\ \cdot\,]$,
$T = C'[R_x\{\tilde u\}[P] \mid Q']$, and $C[Q] \to C'[Q']$. Therefore, $x^-$
does not appear in $D'$ and $Q'$, so we take
$S' = C'[R_x\{\tilde u\}[P] \mid Q']$.

— If the step involves processes in both $Q$ and $R_x\{\tilde u\}[P]$, then it
must use the rule in $\mathcal{E}[\,\cdot\,]$ to produce a message
$\kappa\langle m\rangle$ from a message $emit\langle m\rangle$ in $Q$ and a
message $recv\langle\kappa\rangle$ unfolded from $R_x\{\tilde u\}[\,\cdot\,]$. We
can assume that this message $emit\langle m\rangle$ does not appear under any
definitions in $Q$, since any such definition can be moved to $D$. Hence we have
$Q \equiv emit\langle m\rangle \mid Q'$ for some process $Q'$, so we take
$S' = C[R_x\{\tilde u\}[P \mid \kappa\langle m\rangle] \mid Q']$.

— If the step replaces a message $\kappa\langle m\rangle$ in $P$ with
$emit\langle m\rangle \mid F_{m,x}$ using the rule in
$R_x\{\tilde u\}[\,\cdot\,]$, then $P \equiv P' \mid \kappa\langle m\rangle$ for
some internal state $P'$, so we take
$S' = C[R_x\{\tilde u\}[P' \mid F_{m,x}] \mid (emit\langle m\rangle \mid Q)]$.

— If the step executes the test-and-set in some process $F_{m,x}$ in $P$,
entering $m$ in the table for $x$, then $P \equiv P' \mid F_{m,x}$ for some
internal state $P'$, so we take
$S' = C[R_x\{\tilde u, m\}[P' \mid F'_{m,x}] \mid Q]$.

— Otherwise, the step evaluates the decryption in some process $F'_{m,x}$ in $P$,
yielding $x^\circ\langle\tilde v\rangle$ if $m$ is a well-formed message
$\{c, \tilde v\}_{x^+}$ and 0 otherwise, and $P \equiv P' \mid F'_{m,x}$ for some
internal state $P'$, so if $m = \{c, \tilde v\}_{x^+}$ we take
$S' = C[R_x\{\tilde u\}[P'] \mid (x^\circ\langle\tilde v\rangle \mid Q)]$, and
otherwise $S' = C[R_x\{\tilde u\}[P'] \mid Q]$. □

Lemma 2 shows how a receiving process Rx{u} [P] incorporates a message 
from the public network. The message is immediately reemitted, independently 
of its contents. 

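Lemma 2 (Reception). If $C[R_x\{\tilde u\}[P] \mid (emit\langle m\rangle \mid Q)]$
is a net state, then:

$$
\begin{array}{rl}
C[R_x\{\tilde u\}[P] \mid (emit\langle m\rangle \mid Q)] & \to\; C[R_x\{\tilde u\}[\kappa\langle m\rangle \mid P] \mid Q] \\
 & \succeq\; C[R_x\{\tilde u\}[F_{m,x} \mid P] \mid (emit\langle m\rangle \mid Q)]
\end{array}
$$

Proof. In more detail, we have:

$$
\begin{array}{rl}
C[R_x\{\tilde u\}[P] \mid (emit\langle m\rangle \mid Q)] & \equiv\; C[R_x\{\tilde u\}[(emit\langle m\rangle \mid recv\langle\kappa\rangle) \mid P] \mid Q] \\
 & \to\; C[R_x\{\tilde u\}[\kappa\langle m\rangle \mid P] \mid Q] \\
 & \to\; C[R_x\{\tilde u\}[emit\langle m\rangle \mid F_{m,x} \mid P] \mid Q] \\
 & \equiv\; C[R_x\{\tilde u\}[F_{m,x} \mid P] \mid (emit\langle m\rangle \mid Q)]
\end{array}
$$

The initial structural equivalence unfolds a copy of the replicated process
$recv\langle\kappa\rangle$ within $R_x\{\tilde u\}[\,\cdot\,]$ and groups this
process with $emit\langle m\rangle$. The first reduction step replaces the process
$emit\langle m\rangle \mid recv\langle\kappa\rangle$ with the process
$\kappa\langle m\rangle$ according to the rule defining emit and recv within
$\mathcal{E}[\,\cdot\,]$. The second reduction step consumes
$\kappa\langle m\rangle$ according to the rule within
$R_x\{\tilde u\}[\,\cdot\,]$. The final structural equivalence moves
$emit\langle m\rangle$ back to its original position.

The second step depends only on the deterministic definition of $\kappa$. Using a
standard lemma of the join-calculus, we can substitute $\succeq$ for $\to$ in this
case. □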

Similarly, Lemma 3 deals with a replicated message. In this case, one copy of 
the message always remains available, so the first reduction of Lemma 2 is also 
an expansion. 

Lemma 3 (Reception of a replicated message). Let $C'[\,\cdot\,]$ be a context of
the form
$C[R_x\{\tilde u\}[\,\cdot \mid P] \mid ((\mathsf{repl}\ emit\langle m\rangle) \mid Q)]$,
such that $C'[0]$ is a net state. Then $C'[0] \succeq C'[F_{m,x}]$.

Proof. We have:

$$
\begin{array}{rl}
C'[0] & \equiv\; C'[emit\langle m\rangle] \\
 & \to\; C'[\kappa\langle m\rangle] \\
 & \to\; C'[emit\langle m\rangle \mid F_{m,x}] \\
 & \equiv\; C'[F_{m,x}]
\end{array}
$$

The first structural equivalence unfolds a copy of the replicated message
$emit\langle m\rangle$. The two reduction steps apply the rules for emit, recv and
$\kappa$, respectively, as in Lemma 2. The final structural equivalence folds back
the copy of $emit\langle m\rangle$.

Next, we show that the reduction sequence described above is also an expan- 
sion. Given some BitString value m, we let TZ be the relation that contains all the 
pairs of processes C'[Qi\,C'[Fm,x] related by the lemma. We show that F C By 
construction, we have TZ C — so we can apply the special case of the expansion- 
up-to-expansion proof technique (Lemma 3 of [2]) to analyze two processes C"[0] 
and C'[Fm,x] such that C"[0] TZ C'[Fm,x]- That proof technique requires match- 
ing the reduction steps of C"[0] with those of C'[Fm,x], and showing that TZ is 
closed under application of an evaluation context F[ • ] in which variables bound 
in C"[0] or C'[Fm,x] do not appear. Every step C"[0] ^ T can obviously be 
matched by a step of C'[Fxn,x\, since the inert term 0 cannot take part in the 
step: there is a context C"'[ • ] such that T = C"[0] and C'[Fm,x] — > C''[Fm,x]- By 
Lemma 1 we can choose G"[-] so that C''[Fm,x] is a net state, whence C"[0] is 
one as well. The term repI emit{m) cannot disappear in the step, so it must occur 
in the ‘Q’ part of C"\-]. As in the proof of Lemma 1, we can choose C"[-] so 
that this part has the form (repi emit{m)) \ Q' , so C"[0] TZ C''[Fm,x\- A similar 
argument shows that TZ is closed under application of evaluation contexts. □ 

Lemmas 4 and 5 describe the following steps of communications processing. 
They concern the freshness of a message and its well-formedness, respectively. 






Lemma 4 (Duplicates). If C\Rx{u] [P]] is a net state, then: 

C [Rx{u} [P I P™.,]] P C [P,{il,m} [P I F:,^] (1) 

C [Rx{u, m} [P I Fm,x]] ^ C [Rx{u, m} [P]] (2) 

Proof. This follows from the definition of tables of unique identifiers, standard 
up-to proof techniques, and Lemma 1. In the first relation, there may be other 
copies of Fm^x in P attempting to enter m in the table for x, but each copy yields 
the same process P^.^i so the choice of a particular copy is not observable. □ 

Lemma 5 (Decryption). If is a net state, then C'[P4, 3 ,] P C[x° tfu)] 

when m is a well-formed message {c,v}„,,+ and C[Ff^ x C[0] otherwise. 

Proof. This follows from the definition of decryption in the sjoin-calculus, stan- 
dard up-to proof techniques, and Lemma 1 . □ 

Composing these last three lemmas in the case where there is an emitter 
Ex[^, we can summarize a successful run of the protocol as follows: 

Lemma 6 (Completion). If S = C\Rx{u} [P] \ {Ex^ \ Q)] is a net state, c 
is a BitString name that does not appear in S, and m = {c, is a well-formed 
message, then: 

S = def fresh c in C[Rx{u} [P] | ((repi emit{m)) \Q)] 

y def fresh c in C[Rx{u,m} [P] | (x°{v) \ (repI emit{m)) \ Q)] 

Proof. We simply unfold the definitions, use structural equivalence, and succes- 
sively apply Lemmas 3, 4 (first part), and 5 (first case). The first structural 
equivalence is obtained by extending the scope of the new unique identifier c out 
of Ex[v\ = def fresh c in repi emit{{c,v} by hypothesis, the name c does 
not appear elsewhere in the initial process, so there is no capture as its definition 
is lifted outside. As we apply Lemma 4, the replicated message is different from 
any value appearing in the table for x because no such value may contain c. □ 

Lemmas 7 and 8 state that what is left after a successful run of the protocol 
is indistinguishable from noise, and can be discarded up to x. Lemma 7 says 
that a well-formed message m that appears in the table for a channel x can be 
uniformly replaced with a fresh name d. Lemma 8 says that an ill-formed value 
can be discarded from the table. 

Lemma 7 (Noise). If S = C [Rx{u, d} [P]] is a net state, d is a BitString name 
that is not hound in S, P^ ,,, does not appear unguarded in P, c is a BitString 
name that does not appear in S, and m = {c, ri}^+, then: 

def fresh c in {S{"^/d}) x def fresh d in S' 

Proof. We let TZ be the relation that contains all pairs of processes equated in the 
lemma, and prove that P C x. Every reduction commutes with the substitution; 
in particular: 






~ the substitution {"*/d} is an injection from values in which c does not appear 
to values in which d does not appear, so every reduction step that depends 
on the comparison of two values selects the same branch before and after 
the substitution; 

~ S does not attempt to decrypt d with x~ , and S{'^/d} does not attempt to 
decrypt m with x~ , since ^ does not appear unguarded in P and x~ does 
not appear elsewhere in S. 

By Lemma 1, if S' ^ S' then we can take S' to be a net state. Furthermore, F'^ ^ 
cannot appear unguarded in S': the processes that may appear in P are 
inert, because d already appears in the table. It follows that TZ is closed under re- 
duction. Closure under application of evaluation contexts follows from Lemma 1. 
Finally, does not operate on channel names, so for every channel v we have 

Si.iLndonlyif^n.li,. 



Lemma 8 (Table simplification). If C [Rx{u,v} [P]] is a net state where v 
is not a well-formed message, then C [Rx{u,v} [P]] x C[Rx{u\ [P]]. 

Proof. We let TZ be the relation that contains all pairs of processes equated in 
the lemma, and prove that P C x. To establish the bisimulation requirement, 
we compare the reductions of processes that TZ relates. In the case where P = 
Fv,x I P' for some internal state P', a reduction step is enabled only on the 
right-hand side: 



C[Rx{u} [Fx,x I P']] - C [Rx{u,v} [P;, I P']] 

The left-hand side does not need to match this step, as the second part of 
Lemma 4 and the second case of Lemma 5 give us: 

C [Rx{u,v} [Fx,x I P']] X C [Rx{u,v} [P']] X C [Rx{u,v} [P;, | P']] 

In all other cases, reductions are the same on both sides, and lead to related net 
states by Lemma 1. Closure under application of evaluation contexts is likewise 
a direct consequence of Lemma 1. Finally, immediate outputs are the same on 
both sides of TZ. □ 

We are now ready to prove the correctness requirements for our protocol 
(Theorem 1): 

Proof. We let R consist of all terms Rx{u} [0] where u is a tuple of pairwise 
distinct values of type BitString in which x~ does not appear. Thus, R consists 
of derivatives of Rx with different contents in the table of unique identifiers. 

1: By definition, Rx = Rx{} [0]. 

2, 3: These conditions are syntactically obvious for Ex[-] and for every P € R. 






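4: We must prove that

\[ \mathcal{E}nv_{x^\pm}[\,R \mid E_x[v]\,] \succeq \mathcal{E}nv_{x^\pm}[\,R \mid x^\circ\langle v\rangle\,] \]

for every $R \in \mathcal{R}$ and every tuple $v$ of BitString values whose length matches
the arity of $x$ and in which $x^-$ does not appear. We derive this expansion
relation by composing Lemmas 6, 7, and 8. Let $R = R_x\{u\}[0]$, let $c$ and $d$
be fresh BitString names, and let $m = \{c,v\}_{x^+}$; we have:

\begin{align*}
 & \mathcal{E}nv_{x^\pm}[\,R_x\{u\}[0] \mid E_x[v]\,] \\
\succeq{}\; & \text{def fresh } c \text{ in } \mathcal{E}nv_{x^\pm}[\,R_x\{u,m\}[0] \mid (\text{repl } emit\langle m\rangle) \mid x^\circ\langle v\rangle\,] \tag{1} \\
\approx{}\; & \text{def fresh } d \text{ in } \mathcal{E}nv_{x^\pm}[\,R_x\{u,d\}[0] \mid (\text{repl } emit\langle d\rangle) \mid x^\circ\langle v\rangle\,] \tag{2} \\
\approx{}\; & \text{def fresh } d \text{ in } \mathcal{E}nv_{x^\pm}[\,R_x\{u\}[0] \mid (\text{repl } emit\langle d\rangle) \mid x^\circ\langle v\rangle\,] \tag{3} \\
\equiv{}\; & \mathcal{E}nv_{x^\pm}[\,R_x\{u\}[0] \mid x^\circ\langle v\rangle\,] \tag{4}
\end{align*}

Let $Q_0 = W \mid plug\langle emit, recv\rangle \mid plug'\langle x^+\rangle$. The expansion (1) is obtained
by applying Lemma 6, taking $C[\cdot] = \mathcal{E}[\text{def keys } x^+, x^- \text{ in } \cdot\,]$ and $Q = Q_0$.
The relations (2) and (3) are obtained by applying Lemmas 7 and 8, respectively,
taking $C[\cdot] = \mathcal{E}[\text{def keys } x^+, x^- \text{ in } \cdot \mid (x^\circ\langle v\rangle \mid (\text{repl } emit\langle d\rangle) \mid Q_0)]$.
The structural equivalence (4) is obtained by first restricting the scope of $d$
to the process $\text{repl } emit\langle d\rangle$ (by hypothesis, $d$ does not occur elsewhere), then
by using the structural equivalence $W \equiv W \mid \text{def fresh } d \text{ in repl } emit\langle d\rangle$.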

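5: Suppose that

\[ \mathcal{E}nv_{x^\pm}[\,R_x \mid emit\langle m\rangle\,] \to Q \]

and that $x^+$ and $x^-$ do not occur in $m$. The only possible reduction of
$\mathcal{E}nv_{x^\pm}[R_x \mid emit\langle m\rangle]$ is the first reduction shown in Lemma 2, where a
message is consumed and turned into a message on a continuation channel.
The message can be either $emit\langle m\rangle$ or a copy of $emit\langle d\rangle$ unfolded from $W$.
We treat only the first case; the second one is similar. Since $x^+$ does not
occur in $m$, it follows that $m$ cannot be a well-formed message for $x$. We
thus have:

\begin{align*}
Q &= \mathcal{E}nv_{x^\pm}[\,R_x\{\}[\kappa\langle m\rangle]\,] \\
  &\succeq \mathcal{E}nv_{x^\pm}[\,R_x\{\}[F_{m,x}] \mid emit\langle m\rangle\,] \\
  &\succeq \mathcal{E}nv_{x^\pm}[\,R_x\{m\}[F'_{m,x}] \mid emit\langle m\rangle\,] \\
  &\approx \mathcal{E}nv_{x^\pm}[\,R_x\{m\}[0] \mid emit\langle m\rangle\,] \\
  &\approx \mathcal{E}nv_{x^\pm}[\,R_x \mid emit\langle m\rangle\,]
\end{align*}

by successively applying Lemmas 2, 4 (first part), 5 (second case), and 8.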
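6: We prove that if $R_x\{u\}[0] \in \mathcal{R}$, $x^-$ does not occur in $m$, and

\[ \mathcal{E}nv_{x^\pm}[\,R_x\{u\}[0] \mid emit\langle m\rangle\,] \to Q \]

then

\[ Q \succeq \mathcal{E}nv_{x^\pm}[\,R' \mid T\,] \]

for some $R' \in \mathcal{R}$ and some process $T$ such that $x^-$ does not occur in $T$.
As in the previous argument, the reduction step can be only the first one
of Lemma 2, and the message can be either $emit\langle m\rangle$ or a copy of $emit\langle d\rangle$
unfolded from $W$. We detail only the case of $emit\langle m\rangle$; that of $emit\langle d\rangle$ is
similar. Let $Q' = \mathcal{E}nv_{x^\pm}[R_x\{u\}[F_{m,x}] \mid emit\langle m\rangle]$. Applying Lemma 2, we
obtain $Q \succeq Q'$. If $m$ is in $u$, then Lemma 4 (second part) yields

\[ Q' \succeq \mathcal{E}nv_{x^\pm}[\,R_x\{u\}[0] \mid emit\langle m\rangle\,] \]

so we can take $R' = R_x\{u\}[0]$ and $T = emit\langle m\rangle$. Otherwise, let
$Q'' = \mathcal{E}nv_{x^\pm}[R_x\{u,m\}[F'_{m,x}] \mid emit\langle m\rangle]$. Applying Lemma 4 (first part), we
obtain $Q' \succeq Q''$. In turn, applying Lemma 5, we obtain

\[ Q'' \succeq \mathcal{E}nv_{x^\pm}[\,R_x\{u,m\}[0] \mid x^\circ\langle v\rangle \mid emit\langle m\rangle\,] \]

or

\[ Q'' \succeq \mathcal{E}nv_{x^\pm}[\,R_x\{u,m\}[0] \mid emit\langle m\rangle\,] \]

depending on whether $m$ is well-formed. So we can take $R' = R_x\{u,m\}[0]$
and $T = x^\circ\langle v\rangle \mid emit\langle m\rangle$ or $T = emit\langle m\rangle$. □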
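
Read operationally, the case analysis of item 6 describes a receiver that ignores messages already in its table, records every fresh message, and delivers a payload only when the message is well-formed. The Python sketch below mimics this behaviour outside the calculus. The class `Receiver`, the callback `decrypt`, and the toy `enc:` prefix standing in for encryption are hypothetical names introduced for illustration only; they are not part of the protocol definition.

```python
from typing import Callable, List, Optional, Set


class Receiver:
    """Toy model of a receiver R_x{u} with a table u of messages seen so far."""

    def __init__(self, decrypt: Callable[[bytes], Optional[bytes]]):
        # decrypt returns the payload of a well-formed message, None otherwise
        self.decrypt = decrypt
        self.table: Set[bytes] = set()     # the table u
        self.delivered: List[bytes] = []   # payloads output on x°

    def receive(self, m: bytes) -> str:
        if m in self.table:                # m is already in u: replay, ignored
            return "replay"
        self.table.add(m)                  # otherwise m is added to the table
        payload = self.decrypt(m)
        if payload is None:                # m is not well-formed: no output
            return "ill-formed"
        self.delivered.append(payload)     # m is well-formed: deliver v
        return "delivered"


def toy_decrypt(m: bytes) -> Optional[bytes]:
    """Hypothetical stand-in for decryption with the private key."""
    prefix = b"enc:"
    return m[len(prefix):] if m.startswith(prefix) else None


r = Receiver(toy_decrypt)
results = [r.receive(b"enc:hello"), r.receive(b"enc:hello"), r.receive(b"junk")]
```

The second copy of the same message is dropped, and the ill-formed message enlarges the table without producing any output, which is the situation Lemma 8 shows to be unobservable.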

7 Conclusion 

In this paper, we apply a process calculus and its theory to address correctness 
issues for basic cryptographic protocols. The protocols are intended to convey 
securely a single datum, and they are described with only an abstract view 
of networking and cryptographic algorithms. Nevertheless, these protocols are 
crucial for translations that map arbitrary join-calculus processes to lower-level 
processes that communicate over a public network. The security of the transla- 
tions depends on the correctness of the underlying protocols. We present a few 
sufficient conditions for correctness, and an example of a correct protocol. 

This paper is a partial review and a continuation of an ongoing research 
project. We are currently investigating extensions of the join-calculus with con- 
structs for authentication. The corresponding protocols implement authentica- 
tion using digital signatures. They address some of the inefficiencies of the pro- 
tocols discussed in this paper; for example, the presence of identity information 
removes the mutual anonymity of emitters and receivers, and enables some reuse 
of keys. So far, in this study of authentication, we have treated protocols on a
case-by-case basis. However, we hope to obtain general correctness criteria
analogous to those of this paper.

References 

1. Martin Abadi. Protection in programming-language translations. In Proceedings 
of the 25th International Colloquium on Automata, Languages and Programming, 
pages 868-883, July 1998. 128 



140 



Martin Abadi et al. 



2. Martin Abadi, Cedric Fournet, and Georges Gonthier. Secure implementation of 
channel abstractions. Manuscript, on the Web at http://join.inria.fr/; snb- 
snmes [3] and [4]. 127, 133, 135 

3. Martin Abadi, Cedric Fournet, and Georges Gonthier. Secure implementation of 
channel abstractions. In Proceedings of the Thirteenth Annual IEEE Symposium 
on Logic in Computer Science, pages 105-116, June 1998. 122, 123, 124, 130, 132, 
132, 140 

4. Martin Abadi, Gedric Fournet, and Georges Gonthier. Secure communications 
processing for distributed languages. In Proceedings of the 1999 IEEE Symposium 
on Seeurity and Privacy, pages 74-88, May 1999. 122, 124, 140 

5. Martin Abadi and Andrew D. Gordon. A calculus for cryptographic protocols: The 
spi calculus. Information and Computation, 148(1), January 1999. An extended 
version appeared as Digital Equipment Gorporation Systems Research Genter re- 
port No. 149, January 1998. 126, 128 

6. Mihir Bellare, Ran Ganetti, and Hugo Krawczyk. A modular approach to the 
design and analysis of authentication and key exchange protocols. In Proceedings 
of the 30th Annual ACM Symposium on Theory of Computing, pages 419-428, May 
1998. 124, 124 

7. Andrew D. Birrell. Secure communication using remote procedure calls. ACM 
Transactions on Computer Systems, 3(1):1-14, February 1985. 122 

8. Dominique Bolignano. Towards the formal verification of electronic commerce pro- 
tocols. In Proceedings of the 10th IEEE Computer Security Eoundations Workshop, 
pages 133-146, 1997. 122 

9. Rocco De Nicola and Matthew G. B. Hennessy. Testing equivalences for processes. 
Theoretical Computer Science, 34:83-133, 1984. 128 

10. Gedric Fournet. The Join-Caleulus: a Calculus for Distributed Mobile Program- 
ming. PhD thesis, Ecole Polytechnique, Palaiseau, November 1998. 124 

11. Gedric Fournet and Georges Gonthier. The reflexive chemical abstract machine 
and the join-calculus. In Proceedings of POPE ’96, pages 372-385. ACM, January 

1996. 122, 124 

12. Cedric Fournet, Georges Gonthier, Jean-Jacques Levy, Luc Maranget, and Didier 
Remy. A calculus of mobile agents. In Ugo Montanari and Vladimiro Sassone, edi- 
tors, Proceedings of the 7th International Conference on Concurrency Theory, vol- 
ume 1119 of Lecture Notes in Computer Science, pages 406-421. Springer- Verlag, 
August 1996. 124 

13. Cedric Fournet, Cosimo Laneve, Luc Maranget, and Didier Remy. Implicit typing a 
la ML for the join-calculus. In Antoni Mazurkiewicz and Jozef Winkowski, editors, 
Proeeedings of the 8th International Conference on Concurrency Theory, volume 
1243 of Leeture Notes in Computer Science, pages 196-212. Springer- Verlag, July 

1997. 124 

14. Cedric Fournet and Luc Maranget. The join-calculus language (version 1.03). 
Source distribution and documentation available from http://join.inria.fr/, 
June 1997. 124 

15. Alan O. Freier, Philip Karlton, and Paul C. Kocher. The SSL protocol: Version 
3.0. Available at http://home.netscape.com/eng/ssl3/draft302.txt, November 
1996. 122 

16. Dieter Gollmann. What do we mean by entity authentication? In Proceedings of 
the 1996 IEEE Symposium on Security and Privacy, pages 46-54, May 1996. 123 

17. D. Harkins and D. Carrel. RFC 2409: The Internet Key Exchange (IKE). Available 
at ftp://ftp.isi.edu/in-notes/rfc2409.txt, November 1998. 122 



A Top-Down Look at a Secure Message 



141 



18. Pat Lincoln, John Mitchell, Mark Mitchell, and Andre Scedrov. A probabilis- 
tic poly-time framework for protocol analysis. In Proceedings of the Fifth ACM 
Conference on Computer and Communications Security, pages 112-121, November 
1998. 124 

19. Gavin Lowe. Some new attacks upon security protocols. In Proceedings of the 10th 
IEEE Computer Security Foundations Workshop, 1996. 123 

20. Gavin Lowe. A hierarchy of authentication specifications. In Proceedings of the 
10th IEEE Computer Security Foundations Workshop, pages 31-43, 1997. 131 

21. Nancy Lynch. I/O automaton models and proofs of shared-key communications 
systems. In Proceedings of the 12th IEEE Computer Security Foundations Work- 
shop, pages 14-29, 1999. 124 

22. Gatherine Meadows. Analysis of the Internet Key Exchange protocol using the 
NRL protocol analyzer. In Proceedings of the 1999 IEEE Symposium on Security 
and Privacy, May 1999. 122 

23. Gatherine Meadows and Paul Syverson. A formal specification of requirements 
for payment transactions in the SET protocol. In Proceedings of the Financial 
Cryptography Conference, 1998. 122 

24. Alfred J. Menezes, Paul G. van Oorschot, and Scott A. Vanstone. Handbook of 
Applied Cryptography. CRC Press, 1996. 126 

25. Robin Milner. Communication and Concurrency. Prentice Hall International, 1989. 
128 

26. Robin Milner. Functions as processes. Mathematical Structures in Computer Sci- 
enee, 2:119-141, 1992. 125 

27. Robin Milner, Joachim Parrow, and David Walker. A calculus of mobile processes, 
parts I and II. Information and Computation, 100:1-40 and 41-77, September 
1992. 125 

28. J. G. Mitchell, V. Shmatikov, and U. Stern. Finite-state analysis of SSL 3.0. In 
7th USENIX Security Symposium, pages 201-216, 1998. 122 

29. Lawrence Paulson. Inductive analysis of the Internet Protocol TLS. ACM Trans- 
actions on Information and System Security, 2(3), August 1999. 122 

30. A. W. Roscoe. Intensional Specifications of Security Protocols. In Proceedings 
of the 9th IEEE Computer Security Foundations Workshop, pages 28-38. IEEE 
Gomputer Society Press, 1996. 132 

31. Davide Sangiorgi and Robin Milner. The problem of “weak bisimulation up to”. 
In W. R. Gleaveland, editor, Proceedings of CONCUR’92, volume 630 of Lecture 
Notes in Computer Science, pages 32-46. Springer- Verlag, 1992. 129 

32. David Wagner and Bruce Schneier. Analysis of the SSL 3.0 protocol. In Proceedings 
of the Second USENIX Workshop on Electronic Commerce Proceedings, pages 29- 
40, November 1996. A revised version is available at 
http://www.cs.berkeley.edu/~daw/me.html. 122, 124 

33. Tatu Ylonen. SSH — Secure login connections over the Internet. In Proceedings of 
the Sixth USENIX Security Symposium, pages 37-42, July 1996. 123 

34. Tatu Ylonen. Private communication. 1999. 123 



Explaining Updates by Minimal Sums* 



Jürgen Dix¹** and Karl Schlechta²

¹ University of Maryland, College Park, MD 20752, USA
dix@cs.umd.edu,
http://www.uni-koblenz.de/~dix
² Laboratoire d’Informatique de Marseille, CNRS ESA 6077, CMI,
39 rue Joliot Curie, F-13453 Marseille Cedex 13, France
ks@gyptis.univ-mrs.fr
http://protis.univ-mrs.fr/~ks



Abstract. Human reasoning about developments of the world always
involves an assumption of inertia. We discuss two approaches for
formalizing such an assumption, based on the concept of an explanation: (1)
there is a general preference relation $\prec$ given on the set of all
explanations, (2) there is a notion of a distance dist between models and
explanations are preferred if their sum of distances is minimal. Each
distance dist naturally induces a preference relation $\prec_{dist}$. We show exactly
under which conditions the converse is true as well and therefore both
approaches are equivalent modulo these conditions. Our main result is
a general representation theorem in the spirit of Kraus, Lehmann and
Magidor.



1 Introduction 

Reasoning about developments or changing situations¹ is an important problem
in Artificial Intelligence, as was recognized very early. Much of human
reasoning about these problems is based on the assumption that the world is
relatively static. We will, for instance, hesitate to accept as plausible an
explanation which involves many unmotivated changes.

Generally, there is always an assumption of inertia formalizing that certain
properties tend to persist over time. Many nonmonotonic logics have been used
to formalize persistency ([BDK97]). E.g. circumscriptive approaches try to
minimize change by circumscribing certain predicates (see [ZF93,KL95]). Default
logics formalize persistency by stating special default rules ([HM87]). In logic
programming, various versions of negation-as-failure have been defined to specify
that fluents persist if other fluents cannot be proved to hold (see [BD99,GL93]).

In this paper we generalize a particular approach introduced 
in [Win88,Win89] and [Dal88]. The overall framework is propositional logic with 

* A full version of this paper, including detailed proofs, can be obtained from
http://www.cs.umd.edu/~dix/pub_journals.html.

** Currently on leave from University of Koblenz-Landau.

¹ The term situation is not to be confused with the same term in situation calculus.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 142-154, 1999.
© Springer-Verlag Berlin Heidelberg 1999






respect to an underlying signature $\mathcal{L}$. We denote by $Mod_{\mathcal{L}}$ the set of all
propositional models with respect to $\mathcal{L}$. In this paper, however, we do not make use
of the fact that $Mod_{\mathcal{L}}$ is induced by $\mathcal{L}$. We abstract from this and view this set
simply as a set of worlds denoted by $W$. The actual world can then be simply
represented as an element of $W$. In most cases, however, we do not know the
actual world. All we know is the current situation, which is a set of worlds.

Definition 1 (Situation S). A situation $S$ is a set of worlds: $S \subseteq W$. As
usual, $S$ can also be viewed as a set Th of $\mathcal{L}$-formulae: via Gödel’s completeness
theorem, Th induces the set $\{A : A \models Th\} \subseteq W$.

How the world actually evolves (while certain actions occur) will be described 
by a sequence of worlds. 

Definition 2 (Sequence $\sigma$, Explanation Expl). A sequence, denoted by $\sigma$,
$(A_1, \ldots, A_n)$, is a finite list of worlds: $A_i \in W$. We also denote it by
$\sigma := (\sigma_1, \ldots, \sigma_n)$. We say that the sequence $\sigma$ explains the change from
situation $S$ to situation $S^*$ if, by definition, $\sigma_1 \in S$ and $\sigma_n \in S^*$.

The sequence $\sigma$ is also called an explanation (this use was suggested by Daniel
Lehmann and the authors adopt it) for the change of $S$ to $S^*$. We denote by
$\mathrm{Expl}(S, S^*)$ the set of all such explanations. By Expl we mean the set

\[ \mathrm{Expl}(2^W, 2^W) := \bigcup_{S, S^* \subseteq W} \mathrm{Expl}(S, S^*) \]

of all explanations of all possible pairs $(S, S^*)$.

Thus, a sequence $\sigma$ describes the development of the world. We note that in
general, the change from an initial situation S to another situation S* may be 
described by several different developments, even if both S and S* consist of just 
one world. Before going on with the technical definitions, some general comments 
about our approach are in order: 

1. We assume discrete time, and a sequence of observations: At time 1, we
observed $S_1$, at time 2 $S_2$ etc., where the $S_i$ are (usually not complete)
theories corresponding to sets of worlds.

2. An explanation of this sequence is a sequence of worlds $\sigma_1$, $\sigma_2$, etc., with
$\sigma_i \in S_i$.

3. Thus, given a fixed sequence of observations, its explanations all have the
same length.

4. We are interested in singling out the best or most plausible explanations, and
do this by assuming a distance. A distance between worlds reflects
the “cost” or “probability” of a change from one world to the other.

5. Consequently, we consider sequences of worlds with a small sum of
distances between the individual worlds of the sequence as more probable or
plausible than those whose sum of distances is big. Thus, such “minimal”
explanations will be considered best explanations of a given sequence of
observations.





6. A classical example of a distance between worlds is the Hamming distance,
i.e. the number of propositional variables in which they differ. Other
distances are considered e.g. in [Sch95a].

7. Depending on our assumptions about the world, a number of approaches
are possible. First, we can assume an abstract, arbitrary order between
explanations; this idea was pursued in [Sch95b]. Second, we can assume that
explanations with repetitions (i.e. the world has not changed at a certain
moment) are better than those without repetitions. Thus, the sequence
$(w, w)$ is considered better than the sequence $(w, w')$, provided both
explain a given sequence of observations. This idea was pursued in [BLS98].
In the present paper, we push the idea of inertia further, minimizing the
sum of changes involved in a sequence of worlds.
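
The Hamming distance of item 6 can be sketched in a few lines of Python. The sketch assumes that a world is represented as a dictionary from propositional variable names to truth values; this representation and the names `World` and `hamming` are illustrative choices, not taken from the paper.

```python
from typing import Dict

# Assumption: a world assigns a truth value to each propositional variable.
World = Dict[str, bool]


def hamming(w1: World, w2: World) -> int:
    """Number of propositional variables on which the two worlds differ."""
    if w1.keys() != w2.keys():
        raise ValueError("both worlds must interpret the same variables")
    return sum(1 for p in w1 if w1[p] != w2[p])


w = {"p": True, "q": False, "r": True}
w_prime = {"p": True, "q": True, "r": False}
d = hamming(w, w_prime)  # the worlds differ on q and r
```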

Such sequences from $S$ to $S^*$, or explanations, may represent different grades
of plausibility: some sequences are less plausible than others. This leads to the
notion of a plausible explanation illustrated in the next example.

Example 1 (Plausible Explanations).

Sequences that contain loops of the form $(A_1, A_2, A_1)$, and thus are
unnecessarily long, should not be considered plausible explanations. A criterion of
inertia is needed in order to rule out the unmotivated sequences and to define
the set of plausible explanations.

Of course, the most general approach is to just assume any preference relation 
between explanations. 

Definition 3 (Preference Relation $\prec$). A preference relation $\prec$ is any
relation on the set of all explanations Expl:

\[ \prec\ \subseteq\ \mathrm{Expl} \times \mathrm{Expl}. \]

We call an explanation $\sigma$ $\prec$-preferred if, by definition, $\sigma$ is minimal with
respect to $\prec$, i.e. there is no other explanation $\sigma' \neq \sigma$ with $\sigma' \prec \sigma$.

In [Sch95b,BLS98], the authors state general representation results for pref- 
erence relations between arbitrary sequences of models. 

A more intuitive approach to exclude such examples is due to [Win88,Win89]. 
The idea is to assume the notion of a distance between arbitrary worlds. 

Definition 4 (Distance dist). A distance dist is a function that associates to
any two worlds a nonnegative rational number:

\[ dist : W \times W \to \mathbb{Q}_{\geq 0};\quad (A_1, A_2) \mapsto dist(A_1, A_2) \]

The idea of [Win88,Win89] was to measure the sum of all distances in a 
sequence and to consider those sequences as most plausible that correspond to 
minimal sums. 






Definition 5 (Plausible Explanations Induced by dist: $\prec_{dist}$). Let a
distance be given on $W$. Let also two situations $S$, $S^*$ and two explanations for
the change from $S$ to $S^*$, $\sigma := (\sigma_1, \ldots, \sigma_n)$ and $\sigma^* := (\sigma^*_1, \ldots, \sigma^*_m)$, be given
(i.e. $\sigma_1$ and $\sigma^*_1$ are both contained in $S$, and $\sigma_n$ and $\sigma^*_m$ are both contained in $S^*$).

We say that $\sigma$ is more plausible than $\sigma^*$, denoted by $\sigma \prec_{dist} \sigma^*$, if, by definition,

\[ \sum_{i=1}^{n-1} dist(\sigma_i, \sigma_{i+1}) \;<\; \sum_{i=1}^{m-1} dist(\sigma^*_i, \sigma^*_{i+1}). \tag{1} \]

The most plausible explanations for the change from $S$ to $S^*$ are those whose
sum of distances is minimal.

For a sequence $\sigma$ we denote by sum-dist($\sigma$) the number $\sum_{i=1}^{n-1} dist(\sigma_i, \sigma_{i+1})$.
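
As a small illustration of Definition 5, the following Python sketch computes sum-dist and compares two explanations. It assumes that worlds are bit strings, that dist is the Hamming distance, and that the comparison in (1) is strict; the function names `sum_dist` and `more_plausible` are illustrative.

```python
from typing import Callable, Sequence


def hamming(a: str, b: str) -> int:
    """Hamming distance between two worlds given as bit strings of equal length."""
    return sum(x != y for x, y in zip(a, b))


def sum_dist(sigma: Sequence[str], dist: Callable[[str, str], int]) -> int:
    """Sum of the distances between consecutive worlds of the sequence."""
    return sum(dist(a, b) for a, b in zip(sigma, sigma[1:]))


def more_plausible(sigma, sigma_star, dist) -> bool:
    """True if sigma has a strictly smaller sum of distances than sigma_star."""
    return sum_dist(sigma, dist) < sum_dist(sigma_star, dist)


direct = ("00", "11")               # goes straight from 00 to 11
loop = ("00", "01", "00", "11")     # contains a loop of the form (A1, A2, A1)
```

Here `sum_dist(direct, hamming)` is 2 and `sum_dist(loop, hamming)` is 4, so the sequence with the loop of Example 1 is the less plausible explanation.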

Thus, if the notion of a distance is available, we can immediately define an
induced preference relation $\prec_{dist}$. But is the converse also true? I.e. given an
arbitrary preference relation $\prec$ between possible explanations, does there exist a
measure of distance dist on $W$ such that $\prec\ =\ \prec_{dist}$?

The aim of this paper is to completely solve this question by characterizing
those preference relations $\prec$ which can be generated by a distance dist. To do this
we use (an adaptation of) an old algorithm, going back to [Far02], to determine
whether a set of inequalities of sums has a solution.

The plan of the paper is as follows. After introducing some additional
terminology in Section 2, we start in Section 3 with Proposition 1, stating that
the preferred sequences are completely determined by the endpoints of certain
intermediate sequences. We then give a precise formulation of our main theorem
and end with a sketch of the proof of this result. This proof depends heavily on
an abstract representation result briefly reviewed in the appendix. We conclude
with Section 4 by citing related approaches.

2 Terminology 

As already mentioned above, we assume discrete time, given by the integers $\mathbb{N}$.
We also assume that we have only incomplete information about the state of
affairs at times $t_1, \ldots, t_n$. This information is given by a sequence of situations
(i.e. sets of models).

Definition 6. A sequence $\Sigma := (\Sigma_1, \ldots, \Sigma_n)$ is a finite list of
situations: $\Sigma_i \subseteq W$. Equivalently, we can view $\Sigma$ as the product $\prod_{i=1}^{n} \Sigma_i$. A
sequence $\Sigma$ represents our knowledge about the world at times $t_1, \ldots, t_n$. We
denote by $\Pi_{fin}(2^W)$ the union of all finite products of situations (sets of worlds)
in $W$:

\[ \Pi_{fin}(2^W) := \bigcup_{n \in \mathbb{N}} \Big\{ \prod_{i=1}^{n} \Sigma_i \;\Big|\; \emptyset \neq \Sigma_i \subseteq W \Big\} \]

We denote by $\Sigma|_{i_0}$ the restriction of $\Sigma$ to the first $i_0$ components.





If there is a distance dist we can determine the set of those sequences $\sigma$
with $\sigma_i \in \Sigma_i$ for which sum-dist($\sigma$) (see Definition 5) is minimal. We call such
sequences dist-preferred sequences. Analogously, given a preference relation $\prec$
defined on $W$, we call sequences $\prec$-preferred if there are no other sequences $\sigma$
with $\sigma_i \in \Sigma_i$ that are smaller with respect to the relation $\prec$.

Definition 7 (dist- and $\prec$-Preferred Sequences). Let a sequence of
situations $\Sigma := (\Sigma_1, \ldots, \Sigma_n)$ be given. We denote by $\mathrm{Pref}_{dist}(\Sigma)$ (resp. $\mathrm{Pref}_{\prec}(\Sigma)$) the
set of dist-preferred (resp. $\prec$-preferred) sequences of worlds that are compatible
with $\Sigma$: those sequences $\sigma$ satisfying

1. $\sigma_i \in \Sigma_i$ (in particular $\sigma$ and $\Sigma$ have the same length),

2. sum-dist($\sigma$) is minimal (resp. $\sigma$ is $\prec$-preferred) among all sequences
satisfying (1).

Note that the elements of $\mathrm{Pref}_{dist}(\Sigma)$ (resp. $\mathrm{Pref}_{\prec}(\Sigma)$) are plausible explanations for the change
of situation $\Sigma_1$ to $\Sigma_n$.
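
For finite situations, the set of dist-preferred sequences can be computed by brute force. The sketch below enumerates all compatible sequences and keeps those with minimal sum of distances. It assumes worlds are bit strings and dist is the Hamming distance; `pref_dist` is an illustrative name, and the enumeration is exponential in the length of the sequence, so it is meant only for small examples.

```python
from itertools import product
from typing import Callable, Sequence, Set, Tuple


def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


def sum_dist(sigma: Sequence[str], dist: Callable[[str, str], int]) -> int:
    return sum(dist(a, b) for a, b in zip(sigma, sigma[1:]))


def pref_dist(situations: Sequence[Set[str]], dist) -> Set[Tuple[str, ...]]:
    """All sequences sigma with sigma_i in the i-th situation and minimal sum-dist."""
    candidates = list(product(*situations))   # condition 1 of the definition
    if not candidates:
        return set()
    best = min(sum_dist(s, dist) for s in candidates)
    return {s for s in candidates if sum_dist(s, dist) == best}   # condition 2


Sigma = [{"00"}, {"01", "10", "11"}, {"01"}]
preferred = pref_dist(Sigma, hamming)
```

In this example the three candidates have sums 1, 3 and 3, so the only preferred sequence is the one that moves to `"01"` immediately and then stays there.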

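We now associate to any sequence of situations $\Sigma$ the set of endpoints of dist-
(resp. $\prec$-) preferred sequences compatible with $\Sigma$.

Definition 8 ($\mathrm{End}_{\prec}$, $\mathrm{End}_{dist}$). We define the following functions, depending on
the underlying preference relation $\prec_{dist}$ or $\prec$:

\[ \mathrm{End}_{dist} : \Pi_{fin}(2^W) \to 2^W;\quad \Sigma \mapsto \{A \in W \mid \exists \sigma \in \mathrm{Pref}_{dist}(\Sigma) \text{ s.t.}^a\ \sigma_n = A\} \]

\[ \mathrm{End}_{\prec} : \Pi_{fin}(2^W) \to 2^W;\quad \Sigma \mapsto \{A \in W \mid \exists \sigma \in \mathrm{Pref}_{\prec}(\Sigma) \text{ s.t.}^a\ \sigma_n = A\} \]

ᵃ Note that $\Sigma$ and $\sigma$ have the same length, say $n$.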

The function $\mathrm{End}_{dist}(\cdot)$ for given dist has certain properties, which we will
later use to completely characterize it. This means we will prove a theorem of
the form

If the function $\mathrm{End}_{\prec} : \Pi_{fin}(2^W) \to 2^W$ satisfies certain properties,
then there is a distance dist on $W$ such that $\mathrm{End}_{\prec}(\Sigma) = \mathrm{End}_{dist}(\Sigma)$
for all $\Sigma \in \Pi_{fin}(2^W)$.

We would like to emphasize that our approach only assumes knowledge about
$\mathrm{End}_{dist}(\Sigma)$, i.e. about the endpoints in $\Sigma_n$. We do not assume anything about
the endpoints of intermediate sequences of length less than $n$: $\mathrm{End}_{dist}(\Sigma|_i)$ for
$i < n$.

Consequently, from $\mathrm{End}_{dist}(\Sigma)$ the set of all dist-preferred sequences cannot
be reconstructed. On the other hand we show in Proposition 1 (Section 3) that
knowledge of the endpoints of intermediate sequences allows us to completely
reconstruct $\mathrm{Pref}_{dist}(\Sigma)$.
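
A tiny example makes the loss of information concrete: two distances can agree on the set of endpoints for a given sequence of situations while disagreeing on the preferred sequences. The Python sketch below uses a hypothetical three-world set and two hand-made symmetric distance tables; the helper names `pref_dist`, `end_dist` and `make_dist` are illustrative.

```python
from itertools import product


def sum_dist(sigma, dist):
    return sum(dist(a, b) for a, b in zip(sigma, sigma[1:]))


def pref_dist(situations, dist):
    """Sequences compatible with the situations that have minimal sum-dist."""
    candidates = list(product(*situations))
    if not candidates:
        return set()
    best = min(sum_dist(s, dist) for s in candidates)
    return {s for s in candidates if sum_dist(s, dist) == best}


def end_dist(situations, dist):
    """Endpoints of the dist-preferred sequences."""
    return {sigma[-1] for sigma in pref_dist(situations, dist)}


def make_dist(table):
    """Symmetric distance from a table on unordered pairs, with dist(w, w) = 0."""
    def dist(a, b):
        return 0 if a == b else table[frozenset((a, b))]
    return dist


# Two hypothetical distances on the worlds a, b, c.
d1 = make_dist({frozenset("ab"): 1, frozenset("ac"): 2, frozenset("bc"): 1})
d2 = make_dist({frozenset("ab"): 2, frozenset("ac"): 1, frozenset("bc"): 1})

Sigma = [{"a"}, {"b", "c"}, {"a"}]
```

Under `d1` the preferred sequence passes through `b`, under `d2` through `c`, yet both give the endpoint set `{"a"}`. The single value of the endpoint function at this Σ therefore does not determine the preferred sequences.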

Although it is not needed to formulate our problem, the following extension
of $\prec$ from a relation between sequences $\sigma$ to a relation between sequences $\Sigma$ is
very important in the proof of our main result.






Remark 1 (Extending $\prec$, $\prec_{dist}$ to sequences $\Sigma$). The relations $\prec$ (resp. $\prec_{dist}$)
can be straightforwardly extended to relations between sequences $\Sigma$:

$\Sigma \prec \Sigma'$ if, by definition, $\sigma \prec \sigma'$ for all $\sigma \in \mathrm{Pref}_{\prec}(\Sigma)$, $\sigma' \in \mathrm{Pref}_{\prec}(\Sigma')$,

(resp. $\sigma \in \mathrm{Pref}_{dist}(\Sigma)$, $\sigma' \in \mathrm{Pref}_{dist}(\Sigma')$).

We assume that we have the information $\mathrm{End}_{dist}(\Sigma)$ only about products $\Sigma$
of sets of models, but not about arbitrary sets of sequences. Thus,

\[ \mathrm{End}_{dist}(\{a, a'\} \times \{b, b'\}) \]

will be given, but, if $a \neq a'$ and $b \neq b'$, then $\mathrm{End}_{dist}(\{(a, b), (a', b')\})$ will not
necessarily be defined: the sequences $(a, b')$, $(a', b)$ are missing. On the other hand,
we assume that we can reason about unions of sets of sequences, in particular if
a union of products of sets is itself a product of sets, like

\[ \{a, a'\} \times \{b, b'\} = (\{a\} \times \{b\}) \cup (\{a\} \times \{b'\}) \cup (\{a'\} \times \{b\}) \cup (\{a'\} \times \{b'\}). \]



Definition 9 (Legal Sets of Sequences). We call a set of sequences (of sit- 
uations) legal, if this set is a product of sets. 

Thus, we can reason about arbitrary sets of sequences, but the world does not 
give us information about arbitrary, only about legal sets of sequences. It seems 
a natural hypothesis that the language of reasoning may be stronger than the 
language of observation. 
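
Whether a finite set of equal-length sequences is legal can be tested by comparing it with the product of its componentwise projections. The Python sketch below does this; the name `is_legal` is illustrative, and treating the empty set as legal is an assumption made here for convenience.

```python
from itertools import product


def is_legal(sequences) -> bool:
    """True if the set of sequences equals the product of its projections."""
    seqs = set(sequences)
    if not seqs:
        return True                      # assumption: the empty set counts as legal
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        return False                     # a product contains sequences of one length only
    n = lengths.pop()
    components = [{s[i] for s in seqs} for i in range(n)]
    return seqs == set(product(*components))


diagonal = {("a", "b"), ("a'", "b'")}
full = {("a", "b"), ("a", "b'"), ("a'", "b"), ("a'", "b'")}
```

The set `diagonal` is not legal, because the sequences `("a", "b'")` and `("a'", "b")` are missing, whereas `full` is the product `{a, a'} × {b, b'}` and hence legal.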

Obviously, the $\Sigma_n$ are in a stronger position than the other intermediate $\Sigma_i$,
by definition of $\mathrm{End}_{dist}(\Sigma)$. This corresponds to the fact that, considering a
development into the future, we are probably most interested in the final outcome.
Conversely, given a development from the past to the present, we might have
most information about the present.

There are, however, other directions of possible interest, and readers will
see how to adapt our conditions and proofs to the case which interests them. We
examine in this paper the two extremes: all $\mathrm{End}_{dist}(\Sigma|_i)$ are known, and only
one, $\mathrm{End}_{dist}(\Sigma|_n)$, is known. It should not be too difficult to modify our results
and techniques accordingly.



3 Updating by Minimal Sums 

Before formulating our main results, we need some additional notation: If $\sigma$ is a
sequence and $a$ a point, $\sigma a$ will be the concatenation of $\sigma$ with $a$. Consequently

1. $\sigma \times A$ will denote the set of all sequences $\sigma a$, $a \in A$.

2. $\Sigma \times A$ will denote the set of all sequences $\sigma a$, $\sigma \in \Sigma$, $a \in A$. Likewise $\Sigma \times a$
by abuse of notation.







The following proposition illustrates that, if we also know the preferences for
suitable intermediate observations, we can totally determine the preferred sequences.
The meaning of “suitable” will become clear in the proof.

Proposition 1 ($\mathrm{Pref}_{dist}(\Sigma)$ Induced by Intermediate $\mathrm{End}_{dist}(\Sigma|_i)$). Let $\Sigma$
be a sequence in the sense of Definition 7.

$\mathrm{Pref}_{dist}(\Sigma)$ is reconstructible from $\mathrm{End}_{dist}(\Sigma'|_i)$ for suitable $\Sigma'$ with $\Sigma'_i \subseteq \Sigma_i$.

Proof. Fix $i$.

Case 1: $\mathrm{End}_{dist}(\Sigma|_i) = \{a_i\}$. Then for all $x \in \mathrm{End}_{dist}(\Sigma|_{i-1})$ there is a
preferred sequence containing $(x, a_i)$ as a subsequence. Likewise for
$y \in \mathrm{End}_{dist}(\Sigma|_{i+1})$.

Case 2: $|\mathrm{End}_{dist}(\Sigma|_i)| > 1$. If $|\mathrm{End}_{dist}(\Sigma|_{i-1})| = 1$, we apply Case 1 to $i - 1$.
Suppose $|\mathrm{End}_{dist}(\Sigma|_{i-1})| > 1$, and, for the same reason, $|\mathrm{End}_{dist}(\Sigma|_{i+1})| > 1$.
Fix $a_i \in \mathrm{End}_{dist}(\Sigma|_i)$, and consider $\Sigma[i/\{a_i\}]$, where $\Sigma_i$ has been
replaced by $\{a_i\}$, i.e.

\[ \Sigma[i/\{a_i\}] =_{def} \Sigma_1 \times \ldots \times \{a_i\} \times \ldots \times \Sigma_n. \]

If $a_{i-1} \notin \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i-1})$, then there is no preferred sequence
through $(a_{i-1}, a_i)$ in $\Sigma$: any such sequence $\sigma'$ through $(a_{i-1}, a_i)$ is
already in $\Sigma[i/\{a_i\}] \subseteq \Sigma$, and there is a better one in $\Sigma[i/\{a_i\}] \subseteq \Sigma$.
Suppose $a_{i-1} \in \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i-1})$. As $a_i \in \mathrm{End}_{dist}(\Sigma|_i)$, there is
a preferred sequence in $\Sigma$ through $a_i$. It is already in $\Sigma[i/\{a_i\}]$. But in
$\Sigma[i/\{a_i\}]$ there is one through all $a_{i-1} \in \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i-1})$. By
rankedness, all are preferred in $\Sigma$. So there is a preferred sequence in $\Sigma$
through $(a_{i-1}, a_i)$ for all $a_{i-1} \in \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i-1})$. The same
argument applies to $i + 1$.

Suppose now $\sigma, \sigma' \in \mathrm{Pref}_{dist}(\Sigma)$, and $\sigma_i = \sigma'_i$. Let $\sigma = \sigma^i \sigma^*$, where
$\sigma^i = \sigma_1 \ldots \sigma_i$, $\sigma^* = \sigma_{i+1} \ldots \sigma_n$. Likewise let $\sigma' = \sigma'^i \sigma'^*$. Then also
$\sigma^i \sigma'^*$ and $\sigma'^i \sigma^* \in \mathrm{Pref}_{dist}(\Sigma)$. For if not, then e.g. sum-dist($\sigma^i$) > sum-dist($\sigma'^i$),
as $\sigma' \in \mathrm{Pref}_{dist}(\Sigma)$, but then sum-dist($\sigma^i$) + sum-dist($\sigma^*$) > sum-dist($\sigma'^i$) +
sum-dist($\sigma^*$), contradicting $\sigma \in \mathrm{Pref}_{dist}(\Sigma)$.

Thus, any sequence constructed as follows:

$a_i \in \mathrm{End}_{dist}(\Sigma|_i)$,
$a_{i-1} \in \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i-1})$,
$a_{i+1} \in \mathrm{End}_{dist}(\Sigma[i/\{a_i\}]|_{i+1})$

belongs to $\mathrm{Pref}_{dist}(\Sigma)$, and no others. □

Our next theorem is the main result of this paper. In Section 2 we noted that
a preference relation $\prec$ between worlds implies the existence of a function

\[ \mathrm{End}_{\prec} : \Pi_{fin}(2^W) \to 2^W. \]





In general, the properties of this function depend on the underlying $\prec$ relation.
Indeed, if there is a distance dist then the induced function

\[ \mathrm{End}_{dist} : \Pi_{fin}(2^W) \to 2^W \]

has many properties that are due to this distance function. In our main theorem we
want to completely characterize the general function $\mathrm{End}_{\prec}(\cdot)$ by suitable
properties of this kind.

For the following, let therefore $\mathrm{End}_{\prec}(\cdot)$ be any function from $\Pi_{fin}(2^W)$ to $2^W$.

We are looking for conditions on $\mathrm{End}_{\prec}(\cdot)$ which guarantee the existence of
a distance with suitable order and addition on the values and which singles out
$\mathrm{End}_{\prec}(\Sigma)$ exactly for all legal $\Sigma$. If the relation $\prec$ is induced from a distance dist
then the following holds:

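Criterion 10 (Important Conditions)

(C1) $\mathrm{End}_{\prec}(\Sigma) \subseteq \Sigma_n$ if $\Sigma_n$ is the last component of $\Sigma$,

(C2) $\Sigma \neq \emptyset \rightarrow \mathrm{End}_{\prec}(\Sigma) \neq \emptyset$,

(C3) $\mathrm{End}_{\prec}((\bigcup \Sigma_i) \times B) \subseteq \bigcup \mathrm{End}_{\prec}(\Sigma_i \times B)$.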
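
That an endpoint function induced by a distance satisfies these conditions can be checked exhaustively on a small example. The Python sketch below does so for sequences of length 2 over four worlds with the Hamming distance, and for unions of two situations in (C3); it is a sanity check on one instance under these assumptions, not a proof, and the helper names are illustrative.

```python
from itertools import combinations, product


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def sum_dist(sigma, dist):
    return sum(dist(a, b) for a, b in zip(sigma, sigma[1:]))


def end_dist(situations, dist):
    """Endpoints of the sequences with minimal sum of distances."""
    candidates = list(product(*situations))
    if not candidates:
        return set()
    best = min(sum_dist(s, dist) for s in candidates)
    return {s[-1] for s in candidates if sum_dist(s, dist) == best}


def nonempty_subsets(worlds):
    ws = sorted(worlds)
    return [set(c) for r in range(1, len(ws) + 1) for c in combinations(ws, r)]


W = {"00", "01", "10", "11"}
subsets = nonempty_subsets(W)

# (C1): endpoints lie in the last component.
c1 = all(end_dist([A, B], hamming) <= B for A in subsets for B in subsets)
# (C2): nonempty products have at least one endpoint.
c2 = all(end_dist([A, B], hamming) for A in subsets for B in subsets)
# (C3): endpoints for a union come from the endpoints for its parts.
c3 = all(
    end_dist([A1 | A2, B], hamming)
    <= end_dist([A1, B], hamming) | end_dist([A2, B], hamming)
    for A1 in subsets for A2 in subsets for B in subsets
)
```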

There is one last condition that we need in order to prove our equivalence
result: the (Loop) criterion. Before giving the technical details, we give some
intuitive explanations:

1. If a choice function can be represented by a distance, then the relation
generated by it must be free of loops. So it must not be possible to conclude
from the given information that $a + b \prec c + d \prec a + b$; otherwise, there
would be no distances $a, b, c, d$ and addition $+$ representing it. Thus the
Loop condition constrains the general preference relation $\prec$. As we put
sufficiently many operations into the loop condition, the Farkas algorithm
used in our proof will terminate and generate the representing distance.

2. Note that the central conditions for representability in [LMS99]
(conditions (| S1), (| A2), (| A3), (* S1), (* A2), (* A3)) are also essentially loop
conditions. This is not surprising, as the problem there is similar to the
one posed here: we try to embed a partial order into a total order, and this
can only be done if the strict part of the partial order does not contain
any loops.

The proof of our main result uses an abstract representation theorem which
is given in the appendix (to make the paper self-contained). One of the important
ingredients of this representation theorem is a certain equivalence relation $\equiv$. We
define this relation on the set of all sequences of worlds as follows:

$\sigma \equiv \sigma'$ if, by definition, $\sigma$ and $\sigma'$ have the same endpoint.

1. If $\sigma$, $\sigma'$ have the same length, then $[\sigma, \sigma'] := \{\sigma'' : \sigma''_i \in \{\sigma_i, \sigma'_i\} \text{ for all } i\}$.
Note that $[\sigma, \sigma']$ is a (legal) product of sets (of size $\leq 2$). Likewise, if $\Sigma$
is a legal set of sequences, and $\sigma$ a sequence, both of same length, then
$[\Sigma, \sigma] := \{\sigma' : \sigma'_i \in \Sigma_i \cup \{\sigma_i\}\}$.

2. If $\sigma$, $\sigma'$ are two sequences with $\sigma_n = \sigma'_1$, then $\sigma\sigma'$ denotes their
concatenation $(\sigma_1, \ldots, \sigma_n, \sigma'_2, \ldots, \sigma'_{n'})$. We write this also as
$\sigma_1 \times \ldots \times \sigma_n \times \sigma'_2 \times \ldots \times \sigma'_{n'}$.


Definition 11 (Hamming Distance). If $\Sigma$ is a set of sequences, $\sigma$ a sequence,
both of the same length, then the Hamming distance hamm($\Sigma$, $\sigma$) will be the
minimum of the Hamming distances hamm($\sigma'$, $\sigma$), $\sigma' \in \Sigma$. (The Hamming distance
between two sequences $\sigma$, $\sigma'$ of equal length is the number of $i$'s s.t. $\sigma_i \neq \sigma'_i$.)
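
Both notions of this definition translate directly into code. The Python sketch below represents sequences as tuples of world names and assumes the set of sequences is nonempty; the function names `hamm` and `hamm_set` are illustrative.

```python
def hamm(sigma, tau) -> int:
    """Number of positions at which two sequences of equal length differ."""
    if len(sigma) != len(tau):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(sigma, tau) if a != b)


def hamm_set(Sigma, sigma) -> int:
    """Minimum Hamming distance from sigma to a member of the nonempty set Sigma."""
    return min(hamm(s, sigma) for s in Sigma)


Sigma = {("a", "b", "c"), ("a", "x", "y")}
d_near = hamm_set(Sigma, ("a", "x", "y"))   # the sequence itself is in Sigma
d_far = hamm_set(Sigma, ("z", "x", "y"))    # differs from ("a", "x", "y") at position 1 only
```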

We have to state the following definition, which contains the important
conditions (R1)-(R6) and (+1)-(+5) used in the (Loop)-condition of Criterion 13.

Definition 12 (Constructing ≺ and ⪯). Originally, ≺ is only a relation 
between sequences σ, σ'. Here we extend ≺ to (1) a relation between arbitrary 
sums of sequences, and (2) to a relation between sequences Σ. 

≺ and Addition: Let us consider in an abstract setting arbitrary sums of 
distances of sequences σ. I.e. we start with a set {a_σ' : σ' is a subsequence 
of σ, σ is a sequence} and equip it with a binary function +. So we 
consider the set {a_σ + ... + a_σ' : σ, σ' sequences}. In the following we will 
formulate conditions to constrain the interaction between + and ≺. (The 
terms a_σ, a_τ correspond to one sequence. When they are compared, they 
are of equal length. ≈ stands for ⪯ and ⪰ simultaneously.) 

(+1) a_σ + a_τ ≈ a_τ + a_σ, 

(+2) (a_σ + a_τ) + a_η ≈ a_σ + (a_τ + a_η), 

(+3) a_σ ≈ a_σ' → ((a_τ ≈ a_τ') ↔ (a_σ + a_τ) ≈ (a_σ' + a_τ')), 

(+4) a_σ ≈ a_σ' → (a_τ ⪯ a_τ' ↔ a_σ + a_τ ⪯ a_σ' + a_τ'), 

(+5) (a_σ ≺ a_σ' ∧ a_τ ≺ a_τ') → (a_σ + a_τ ≺ a_σ' + a_τ'). 

⪯ and Comparisons: Here we extend ≺ to a relation between sequences 
Σ. This is done by using the function End_⪯. (In (R4), (R5) i ranges over 
some index set I.) 

(R1) Σ × B ⪯ Σ × B' if End_⪯(Σ × (B ∪ B')) ∩ B ≠ ∅, 

(R2) Σ × B ≺ Σ × B' if End_⪯(Σ × (B ∪ B')) ∩ B' = ∅. 

For the left hand side: 

(R3) Σ × B ⪯ Σ' × B if Σ' ⊆ Σ, 

(R4) (∀i ∈ I: Σ' × B ⪯ Σ_i × B) if End_⪯((Σ' ∪ ⋃Σ_i) × B) ⊈ ⋃End_⪯(Σ_i × B), 
(R5) (∀i ∈ I: Σ' × B ≺ Σ_i × B) if ⋂End_⪯(Σ_i × B) ⊈ End_⪯((Σ' ∪ ⋃Σ_i) × B), 
(R6) //nig/V'j^0, then End^(A') = f|jg/ V"*. 

With the help of the notions introduced in the last definition, we define the 
(Loop)-criterion: 



Explaining Updates by Minimal Sums 151 



Criterion 13 (Loop) The (smallest) relation defined by (R1)-(R6), (+1)-(+5) 
(see Definition 12) contains no loops involving ≺ (i.e. loops involving ⪯ are 
allowed, but no loops with the “strictly less” relation ≺). In other words, the 
transitive closure of this relation is antisymmetric. 

Again, if the relation ⪯ is induced from a distance dist then the (Loop) criterion 
is satisfied, as can be easily checked. 
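
For a finite relation, the (Loop) criterion can be tested mechanically. The following Python sketch is only an illustration under simplifying assumptions: the relation is given explicitly as two finite edge lists (non-strict and strict comparisons), and the name `has_strict_loop` is hypothetical. A loop through a strict edge (a, b) exists exactly when a is reachable from b along arbitrary edges.

```python
from collections import defaultdict


def has_strict_loop(weak_edges, strict_edges):
    """Return True if some cycle of the combined relation uses a strict edge.

    weak_edges:   pairs (a, b) read as "a is less than or equal to b"
    strict_edges: pairs (a, b) read as "a is strictly less than b"
    """
    graph = defaultdict(set)
    for a, b in list(weak_edges) + list(strict_edges):
        graph[a].add(b)

    def reachable(start, goal):
        # Depth-first search over all edges, strict or not.
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    # A strict edge a -> b lies on a loop iff a can be reached back from b.
    return any(reachable(b, a) for a, b in strict_edges)
```

Loops made of non-strict edges alone are accepted, matching the remark that loops involving the non-strict relation are allowed.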

Theorem 1 (Representation Theorem). Let W, the set of explanations Expl 
and a relation ⪯ ⊆ Expl × Expl be given. Then the following are equivalent: 

1. There is a distance dist from W × W into an ordered abelian group, such 
that End_⪯(Σ) = End_dist(Σ) and Pref_dist(Σ) = Pref_⪯(Σ). 

2. The function End_⪯ satisfies the conditions of Criteria 10 and 13. 

Proof (Sketch). Due to lack of space, we can provide the reader only with 
a small sketch of the proof. For the detailed proof, which is rather involved 
and uses the abstract representation result given in the appendix, we refer to 
URL: http://www.cs.umd.edu/~dix/pub_journals.html. 

The direction from (1) to (2) is trivial. It remains to show that (2) implies (1). 
Assume a function End_⪯ satisfying Criteria 10 and 13. Suppose the required 
distance function dist(i, j) between two neighbouring worlds in a sequence is 
modelled by the variable x_{i,j}. Then, for a sequence σ = (1, 2, ..., m), the distance 
function would yield the sum 

x_{1,2} + x_{2,3} + ... + x_{m-1,m}. 

A similar sum is built up for a sequence τ. 

If σ ⪯ τ, this leads to an inequality for the two sums. In this way, a system 
of inequalities is built up. We solve this system by using a modification of an 
algorithm communicated by S. Koppelberg, Berlin. The original algorithm seems 
to be due to [Far02]. The crucial loop criterion is used to ensure that a solution 
exists. It then remains to show that End_⪯(Σ) = End_dist(Σ). 
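
Once a distance is available, the sum attached to a sequence and the resulting set of endpoints can be computed directly. The sketch below is an illustration only: it assumes integer-valued distances (the theorem allows any ordered abelian group), a finite set of candidate sequences, and the hypothetical names `path_cost` and `end_dist`.

```python
def path_cost(sigma, dist):
    """Sum of dist over neighbouring worlds of the sequence sigma."""
    return sum(dist[(a, b)] for a, b in zip(sigma, sigma[1:]))


def end_dist(sigmas, dist):
    """Endpoints of those sequences in sigmas whose summed distance is minimal."""
    costs = {tuple(s): path_cost(s, dist) for s in sigmas}
    best = min(costs.values())
    return {s[-1] for s, c in costs.items() if c == best}


# Hypothetical distance table over three worlds u, v, w.
DIST = {
    ("u", "u"): 0, ("v", "v"): 0, ("w", "w"): 0,
    ("u", "v"): 1, ("v", "u"): 1,
    ("v", "w"): 1, ("w", "v"): 1,
    ("u", "w"): 3, ("w", "u"): 3,
}
CANDIDATES = [("u", "u", "w"), ("u", "v", "w"), ("u", "v", "v")]
```

With this table the three candidate sequences have sums 3, 2 and 1, so only the last one is minimal and its endpoint v is selected.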

4 Conclusions and Acknowledgements 

One of the features that most clearly distinguishes classical reasoning, as applied 
in mathematics, from human reasoning, as applied in everyday life, is the treatment 
of how the world changes over time. Humans use the fact, often induced by context, 
that certain properties persist over time. Frameworks for studying the formalization 
of this persistence are very important for developing reasoning calculi that can be 
applied to realistic scenarios. The many frameworks for belief revision studied 
in the last 15 years all treat this problem. 

Many systems have been proposed for dealing with this persistence 
problem. For example, depending on our assumptions about the world, a number 
of approaches are possible: 






1. We can assume an abstract, arbitrary order between explanations; this 
idea was pursued in [Sch95b] and [Dal88]. 

2. We can assume that explanations with repetitions (i.e. the world has not 
changed at a certain moment) are better than those without repetitions. 
Thus, the sequence ⟨w, w⟩ is considered better than the sequence ⟨w, w'⟩, 
provided both explain a given sequence of observations. This idea was 
pursued in [BLS98]. 

3. In the present paper, we push the idea of inertia further, minimizing the 
sum of changes involved in a sequence of worlds ([Win88,Win89]). 

We have shown in this paper the exact relationship between these approaches. 
We developed a general representation result in the spirit of [KLM90], Theo- 
rem 1, stating under exactly what conditions an arbitrary preference ordering is 
induced by a distance on the underlying models. 

We note in particular that although the main theorem can be stated without 
too much technical machinery, its proof requires quite a bit of technical notation. 
We also note our use of an old result of Farkas: this shows once again that math- 
ematical results considered quite exotic still find their applications in modern 
computer science. 

We owe special thanks to three anonymous referees for their careful reading 
and their numerous suggestions to improve this paper. In particular, one of 
the referees read the paper extremely carefully and his suggestions were greatly 
appreciated. 



References 



BD99. Gerhard Brewka and Jürgen Dix. Knowledge representation with extended 
logic programs. In D. Gabbay and F. Guenthner, editors, Handbook of 
Philosophical Logic, 2nd Edition, Volume 6, Methodologies, chapter 6. Reidel Publ., 
1999. 

BDK97. Gerd Brewka, Jürgen Dix, and Kurt Konolige. Nonmonotonic Reasoning: An 
Overview. CSLI Lecture Notes 73. CSLI Publications, Stanford, CA, 1997. 

BLS98. Shai Berger, Daniel Lehmann, and Karl Schlechta. Preferred history 
semantics for iterated updates. Technical Report TR-98-11, Leibniz Center for 
Research in Computer Science, Hebrew University, Givat Ram, Jerusalem 
91904, Israel, July 1998. To appear in Journal of Logic and Computation. 

Dal88. M. Dalal. Investigations into a theory of knowledge bases revisions. In 
NCAI’88, pages 475-479. Morgan Kaufmann, 1988. 

LMS99. Menachem Magidor, Daniel Lehmann and Karl Schlechta. Distance semantics 
for belief revision. Journal of Symbolic Logic, to appear, 1999. 

Far02. J. Farkas. Theorie der einfachen Ungleichungen. Crelles Journal für die Reine 
und Angewandte Mathematik, 124:1-27, 1902. 

GL93. Michael Gelfond and Vladimir Lifschitz. Representing Actions and Change 
by Logic Programs. Journal of Logic Programming, 17:301-322, 1993. 

HM87. S. Hanks and D. McDermott. Nonmonotonic Logic and Temporal Projection. 
Artificial Intelligence, 33:379-412, 1987. 

KL95. G. N. Kartha and V. Lifschitz. A Simple Formalization of Actions Using 
Circumscription. In Chris S. Mellish, editor, 14th IJCAI, volume 2, pages 
1970-1975. Morgan Kaufmann, 1995. 

KLM90. Sarit Kraus, Daniel Lehmann, and Menachem Magidor. Nonmonotonic 
Reasoning, Preferential Models and Cumulative Logics. Artificial Intelligence, 
44(1):167-207, 1990. 

Sch95a. Karl Schlechta. Logic, topology and integration. Journal of Automated 
Reasoning, 14:353-381, 1995. 

Sch95b. Karl Schlechta. Preferential choice representation theorems for branching 
time structures. Journal of Logic and Computation, 5:783-800, 1995. 

Win88. Marianne Winslett. Reasoning about action using a possible models approach. 
In Proc. 7th AAAI-88, pages 89-93, 1988. 

Win89. Marianne Winslett. Sometimes updates are circumscription. In Proc. 11th 
Int. Joint Conf. on Artificial Intelligence (IJCAI'89), pages 859-863, 1989. 

ZF93. Yan Zhang and Norman Y. Foo. Reasoning about persistence: A theory 
of actions. In Ruzena Bajcsy, editor, Proc. of the 13th Int. Joint Conf. on 
Artificial Intelligence (IJCAI'93), pages 718-723. Morgan Kaufmann, 1993. 



A An Abstract Representation Result 

Definition 14 (The Abstract Framework). Let the following be given: 

1. a nonempty universe U , an arbitrary set, 

2. a function fl : 7T^„(2^) — > 2^, 

3. an equivalence relation = on U (we write [[u]] for the equivalence class of 
u ∈ U under =) such that [[u]] is finite for all u ∈ U, 

4. two relations ≺ and ⪯ on U with ≺ ⊆ ⪯. We denote by ⪯* (resp. ≺*) the 
transitive closure of ⪯ (resp. ≺). 

We also assume that the following holds for Ω, ≺ and ⪯: 

(Ω0) We assume two conditions: Ω(A) ⊆ A, and “A ≠ ∅ implies Ω(A) ≠ ∅”, 

(Ω1) if a ∈ A, [[a]] ∩ Ω(A) = ∅, [[b]] ∩ Ω(A) ≠ ∅, 
then there is b' ∈ [[b]] ∩ A, b' ≺* a, 

(Ω2) if a ∈ A, [[a]] ∩ Ω(A) ≠ ∅, [[b]] ∩ Ω(A) ≠ ∅, 
then there is b' ∈ [[b]] ∩ A, b' ⪯* a. 

In the first part (Proposition 2 (1)), we construct a ranked² order ◁ on U 
by extending the relation ≺ (and ⪯), and show that Ω = Ω_◁, where Ω_◁ is the 
minimality operation induced by ◁: Ω_◁(X) := {x ∈ X : ¬∃x' ∈ X. x' ◁ x}. 

² ◁ on U is called ranked if, by definition, there exists a function rank : U → T from U 
to a strict total order (T, <_T) such that u ◁ u' if and only if rank(u) <_T rank(u'). 

Proposition 2 (Constructing Ranked Orders). 

(1) If the relation ⪯ is free from cycles containing ≺, then ⪯ can be extended 
to a ranked order ◁ s.t. for all A ⊆ U and a ∈ A: 

[[a]] ∩ Ω(A) = ∅ if and only if [[a]] ∩ Ω_◁(A) = ∅. 

(2) If, in addition, U is a set of abstract distances d(·,·) over some space W, 
i.e. U = {d(x, y) : x, y ∈ W} s.t., in addition to the conditions Ω0, Ω1, Ω2 
the following holds: 

(d1) ∀x, y ∈ W: x ≠ y implies d(x, x) ≺ d(x, y), 

(d2) ∀x, y ∈ W: d(x, x) ⪯ d(x, y) 

and the relation ⪯ is free from cycles containing ≺, then there is a 
totally ordered set (Z, <) with a minimal element 0 and a distance 
function dist : W × W → Z s.t. 

(a) 0 = dist(x, x) for any x ∈ W, 

(b) d(u, v) ≺ d(x, y) ⟹ dist(u, v) < dist(x, y), 
d(u, v) ⪯ d(x, y) ⟹ dist(u, v) ≤ dist(x, y), 

(c) for all A ⊆ U, a ∈ A: [[a]] ∩ Ω(A) = ∅ if and only if [[a]] ∩ Ω_<(A) = ∅. 
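
The minimality operation induced by a ranked order has a particularly simple computational reading: the minimal elements of a set are exactly those of lowest rank. The Python sketch below illustrates this under the assumption of a finite set and a rank function into the integers; the names `omega_direct` and `omega_ranked` are hypothetical.

```python
def omega_direct(xs, less):
    """Elements x of xs such that no x' in xs satisfies less(x', x)."""
    xs = list(xs)
    return {x for x in xs if not any(less(y, x) for y in xs)}


def omega_ranked(xs, rank):
    """Same operation when the order is ranked: keep the elements of lowest rank."""
    xs = list(xs)
    if not xs:
        return set()
    lowest = min(rank(x) for x in xs)
    return {x for x in xs if rank(x) == lowest}
```

Both functions agree whenever `less(y, x)` is defined as `rank(y) < rank(x)`, which is the defining property of a ranked order.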




A Foundation for Hybrid Knowledge Bases* 



James J. Lu¹, Neil V. Murray², and Erik Rosenthal³ 

¹ Department of Computer Science, Bucknell University, 
Lewisburg, PA 17837, USA 
jameslu@bucknell.edu 

² Department of Computer Science, State University of New York, 
Albany, NY 12222, 
nvm@cs.albany.edu 

³ Department of Mathematics, University of New Haven, 
West Haven, CT 06516, 
brodsky@charger.newhaven.edu 



Abstract. Hybrid knowledge bases (HKB’s) [11] were developed to 
provide formal models for the mediation of data and knowledge bases [14,15]. 
They are based on Generalized Annotated Logic Programming (GAP) [7] 
and employ an inference mechanism, HKB-resolution, that is 
considerably simpler than those that have been proposed for GAP. The simplicity 
of HKB-resolution is explained in this paper by showing that it is a 
special case of ℧-resolution, which was introduced in [9]. A generalization 
of ℧-resolution to lattices that are not ordinary is also explored. 

Keywords: inference, hybrid knowledge bases, deduction 



1 Introduction 

Hybrid knowledge bases (HKB’s) were proposed in [11] to model mediated sys- 
tems [14,15,16,17], which are knowledge base systems that reason across data- 
bases and knowledge sources with different structures. HKB’s combine several 
forms of automated reasoning: constraint logic programming [6], non-monotonic 
reasoning, and annotated logic programming. Several implementations of 
HKB’s - for example, the KOMET system [1,2] and the HERMES system [13] - 
have been realized and applied to mediation tasks. 

Hybrid knowledge bases grew out of the generalized annotated logic pro- 
gramming (GAP) of Kifer and Subrahmanian [7], which is a particular type of 
multiple-valued logic programming [4].¹ HKB’s differ from GAP’s in one 
consequential way: The query processing mechanism developed for HKB’s, called 
HKB-resolution in [11], is considerably simpler than those that have been for- 
mulated for GAP’s (see, for example, [7], [8], and [12]). The disparity can be 

* This research was supported in part by the National Science Foundation under grants 
CCR-9731893, CCR-9404338 and CCR-9504349. 

¹ A good survey of multiple-valued logics and their applications, including logic 
programming, can be found in [5]. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 155—167, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






summarized as follows: HKB-resolution is an efficient top-down query process- 
ing procedure similar to SLD-resolution for classical logic programming; no such 
general, efficient top-down procedures have been found for GAP’s. 

This paper provides an explanation of the mathematical simplicity of 
HKB-resolution. The key is that HKB assumes a truth value set that is an ordinary 
lattice so that, for the specific truth domain formulated in [11], HKB-resolution 
is a special case of ℧-resolution.² The simplicity of HKB-resolution comes from 
the underlying structure of the truth domain. It also has the search space pruning 
advantages inherent in ℧-resolution. 

The basic ideas of HKB’s and GAP’s are described in Section 2, and 
℧-resolution is described in Section 3. The relationship between HKB-resolution 
and ℧-resolution is developed in Section 4, and some computational properties of 
℧-resolution that are inherited by HKB-resolution are investigated in Section 5. 
Section 5.2 contains an example from the KOMET system. Proofs are often 
omitted due to space considerations. 

2 Hybrid Knowledge Bases and Generalized Annotated 
Logic Programming 

A generalized annotated logic program consists of a first order language ℒ and a 
complete lattice of truth values Δ under some ordering ⪯. An annotated atom is 
an expression of the form A : μ where A is an atom in ℒ, and μ is an annotation, 
that is, a term over Δ: μ is a constant, a variable, or a complex term built out 
of constants, variables, and function symbols. 

An annotated clause is an expression of the form 

A : μ ← B_1 : β_1, ..., B_n : β_n, n ≥ 0 

where A : μ and each B_i : β_i are annotated atoms. The head of the annotated 
clause is A : μ, and {B_1 : β_1, ..., B_n : β_n} is the body of the clause. A generalized 
annotated logic program consists of a finite collection of annotated clauses.³ 

A hybrid knowledge base is a GAP whose truth domain is Δ = [0, 1]^𝒩;⁴ 
i.e., Δ is the set of all functions from the non-negative integers 𝒩 to the closed 
interval [0, 1]. Without loss of generality, we need only consider l-representable 
functions [11]; that is, functions f that are multiples of characteristic functions 
of finite sets: 

f(x) = v for x ∈ {n_1, ..., n_k}; f(x) = 0 otherwise. 

We write such a function as the pair (v, {n_1, ..., n_k}). Intuitively, v is a certainty 
measure for the time points n_1, ..., n_k. The function (0.5, {0, 1, 2}), for example, 
indicates a certainty of 0.5 for the time points 0, 1, 2, and a certainty of 0 for all 
other times. 
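
The pair notation lends itself to a compact data structure. The following Python sketch is an illustration only; the class name `Annotation` and its method `leq` are hypothetical and are not taken from [11]. It stores the certainty value together with a finite set of time points and evaluates the represented function pointwise.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Annotation:
    """A function in [0, 1]^N written as a pair (value, finite set of time points)."""
    value: float
    times: FrozenSet[int]

    def __call__(self, x: int) -> float:
        # The function takes `value` on `times` and 0 everywhere else.
        return self.value if x in self.times else 0.0

    def leq(self, other: "Annotation") -> bool:
        # Pointwise order; outside self.times the left side is 0, so only
        # the points in self.times need to be checked.
        return all(self(x) <= other(x) for x in self.times)
```

For example, `Annotation(0.5, frozenset({0, 1, 2}))` evaluates to 0.5 at time point 1 and to 0.0 at time point 7.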

² The symbol ℧ is pronounced “mho” because it is an upside-down Ω, which is used 
for the unit ohm; see Section 4 for the definitions. 

³ GAP’s also admit constraints, but they need not be considered for the work presented 
here. 

⁴ There is also an assumption on the roles played by constraints in an HKB, and 
HKB’s allow non-monotonic negations. 






To simplify the discussion, assume for the remainder of the paper that the 
annotations in GAP’s and HKB’s are variable free. 

An interpretation is a mapping from the set of ground atoms in ℒ to elements 
of Δ. Given an interpretation I, the (variable-free) annotated atom A : μ is said 
to be satisfied by I, written I ⊨ A : μ, iff μ ⪯ I(A). For a given GAP, the 
ordering ⪯ induces an ordering on the class of all interpretations as follows: 

I_1 ⪯ I_2 iff I_1(A) ⪯ I_2(A) for every ground atom A. 

The notion of satisfaction extends to annotated clauses in a straightforward 
way: I satisfies the clause A : μ ← B_1 : β_1, ..., B_n : β_n if whenever I ⊨ B_i : β_i, 
1 ≤ i ≤ n, then I ⊨ A : μ. The next lemma is immediate. 

Lemma 1. If β ⪯ μ, then A : μ ⊨ A : β. □ 

It follows that if p : μ ← Body is an annotated clause in a GAP P, and if 
← p : β is a query with β ⪯ μ, then the clause and the query may be “resolved,” 
producing the query ← Body. 

This simple extension of ordinary resolution for classical logic 
programming [10] is called annotated resolution [7]. (By itself, it is incomplete.) To 
answer a query over an HKB, the inference rule HKB-resolution was defined 
in [11]. With truth domain Δ = [0, 1]^𝒩, suppose that C is the hybrid knowledge 
clause A_1 : (u, t) ← B_1 : (u_1, t_1), ..., B_n : (u_n, t_n), and that Q is the query 
← A_1 : (v_1, s_1), ..., A_m : (v_m, s_m) with u ≥ v_1. Then the HKB-resolvent of Q 
and C with respect to A_1 is the query 

← A_1 : (v_1, s_1 − t), B_1 : (u_1, t_1), ..., B_n : (u_n, t_n), A_2 : (v_2, s_2), ..., A_m : (v_m, s_m), 

where s_1 − t is the set difference of s_1 and t. If s_1 − t = ∅, then the resulting 
query may be simplified by the removal of the atom A_1 : (v_1, s_1 − t). 

Consider, for example, the following HKB: 

1) p : (0.5, {1, 2}) ← Body_1        2) p : (0.7, {2, 3}) ← Body_2 

The query ← p : (0.5, {1, 2, 3}) may be HKB-resolved against the first clause, 
yielding the query ← p : (0.5, {3}), Body_1. The new query may be HKB-resolved 
against the second clause to produce the next query, ← p : (0.5, ∅), Body_1, Body_2, 
which simplifies to ← Body_1, Body_2. Note that the same result is obtained if the 
first resolution is performed on the second clause. 
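
The example can be replayed with a small Python sketch of a single HKB-resolution step. This is an illustration under stated assumptions, not the procedure of [11]: atoms are ground triples (predicate, value, set of time points), clause bodies are stood in for by the hypothetical atoms `B1` and `B2`, and the function name `hkb_resolve` is invented here.

```python
def hkb_resolve(query, clause, index=0):
    """One HKB-resolution step on the atom query[index].

    query:  list of atoms (pred, v, s) with s a frozenset of time points
    clause: pair (head, body) with head = (pred, u, t) and body a list of atoms
    Returns the new query, or None if the rule does not apply.
    """
    head, body = clause
    pred, v, s = query[index]
    head_pred, u, t = head
    if pred != head_pred or u < v:
        return None
    remaining = s - t  # time points still to be covered
    reduced = [(pred, v, remaining)] if remaining else []
    return query[:index] + reduced + list(body) + query[index + 1:]


# Hypothetical stand-ins for Body_1 and Body_2 of the example.
B1 = ("b1", 0.5, frozenset({1}))
B2 = ("b2", 0.5, frozenset({1}))
C1 = (("p", 0.5, frozenset({1, 2})), [B1])
C2 = (("p", 0.7, frozenset({2, 3})), [B2])
Q0 = [("p", 0.5, frozenset({1, 2, 3}))]

Q1 = hkb_resolve(Q0, C1)   # leaves p : (0.5, {3}) plus the body of C1
Q2 = hkb_resolve(Q1, C2)   # p is fully covered; only body atoms remain
```

The atom for p is dropped as soon as its set of time points becomes empty, which corresponds to the simplification step described above.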



3 Extended ℧-Resolution 

We assume the reader to be familiar with the standard notions of linear and 
input deductions. In a logic programming setting, all deductions begin with a 
query, and all linear deductions are input deductions. Suppose P is a GAP and 
Q = ← A_1 : μ_1, ..., A_n : μ_n is a query, where P ⊨ A_i : μ_i, 1 ≤ i ≤ n. An inference 
rule (or set of inference rules) S for GAP’s is said to be compatible with the linear 
restriction if, for any such P and Q, there is a sequence Q_0 ⊢_S Q_1 ⊢_S ... ⊢_S Q_m 
such that Q_0 = Q, Q_m = □, and for 0 ≤ i < m, Q_{i+1} is the result of applying 
an inference rule in S to Q_i and a clause in P or to Q_i alone. (It is possible, for 
example, to apply an inference rule to a constraint in Q_i.) 

Inference rules that are compatible with the linear restriction enjoy a 
number of practical advantages, including ease of implementation and low memory 
requirements. It turns out that HKB-resolution is compatible with the linear 
restriction, and it is relatively efficient. However, to date, no effective inference 
mechanism has been found for GAP’s that is compatible with the linear 
restriction. In [11] it was speculated that the underlying cause is the restriction of 
HKB’s to l-representable functions. As we shall see shortly, this is not quite the 
case. The underlying cause is the truth domain, which, fixed as [0, 1]^𝒩, is an 
ordinary lattice. As a result, HKB-resolution is a special case of ℧-resolution, 
defined later in this section. 

To define ordinary, let Δ be any lattice; if μ, ρ ∈ Δ, let M(μ, ρ) = {γ ∈ Δ | 
Sup{γ, ρ} ⪰ μ}, and let ℧(μ, ρ) = Inf M(μ, ρ). Then Δ is said to be an ordinary 
lattice if ℧(μ, ρ) ∈ M(μ, ρ). 
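
To make the definition concrete, the following Python sketch computes M(μ, ρ) and ℧(μ, ρ) by brute force in a small powerset lattice, where the order is set inclusion, Sup is union and Inf is intersection. The choice of lattice and the function names `m_set`, `mho` and `is_ordinary` are assumptions made for this illustration.

```python
from itertools import combinations


def powerset(base):
    """All subsets of base, as frozensets."""
    items = list(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def m_set(lattice, mu, rho):
    """M(mu, rho): all gamma with Sup{gamma, rho} above mu (Sup is union here)."""
    return [g for g in lattice if mu <= (g | rho)]


def mho(lattice, mu, rho):
    """Inf of M(mu, rho); in a powerset lattice Inf is intersection."""
    candidates = m_set(lattice, mu, rho)  # never empty: mu itself qualifies
    result = candidates[0]
    for g in candidates[1:]:
        result &= g
    return result


def is_ordinary(lattice):
    """Check that the Inf of M(mu, rho) lies in M(mu, rho) for all pairs."""
    return all(mho(lattice, mu, rho) in m_set(lattice, mu, rho)
               for mu in lattice for rho in lattice)
```

In this lattice ℧(μ, ρ) works out to the set difference of μ and ρ, which always belongs to M(μ, ρ).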

Consider first inference mechanisms in the more general GAP setting. A 
consequence of the definition of satisfaction is that, given an interpretation I and 
annotated atoms A : μ_1 and A : μ_2, I ⊨ A : μ_1 and I ⊨ A : μ_2 if and only if 
I ⊨ A : Sup{μ_1, μ_2}. In practical terms, given a query ← A : μ, the information 
required to resolve A : μ may be distributed across the heads of several clauses in 
the program. These clauses, therefore, need to be combined in some way. This is 
illustrated by the next example. 

Example. Let Δ and P be the lattice SIX and the GAP shown in Figure 1. 
Let Q be the query ← p : ⊤. 

P_0: p : t ← q : t 
P_1: p : lf ← q : f 
P_2: q : t ← 
P_3: q : f ← 

Fig. 1. The Lattice SIX and GAP P 

It is clear that P entails p : ⊤. Thus the query ← p : ⊤ should be provable. 
It is also clear, however, that ← p : ⊤ cannot be proved via annotated resolution 
alone.⁵ Both P_0 and P_1 must be used, but neither resolves with the query, and 
soundly combining them (such as with the reduction rule of annotated logic 
programming) is not linear. To maintain linearity, any inference must initially 
use the query ← p : ⊤. No single annotated clause has a “sufficiently high” 
annotation value to resolve p : ⊤ away. One remedy is to break p : ⊤ down to 
“simpler” queries, the combination of which is equivalent to the original query. 

Considering the lattice SIX, note that there are four possible ways to 
decompose ← p : ⊤ into a set of simpler but equivalent queries. They are: 



1. ← p : f, p : t    2. ← p : f, p : lt    3. ← p : lf, p : t    4. ← p : lf, p : lt 

⁵ Hence the incompleteness of annotated resolution alluded to earlier. 
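
The four decompositions can be enumerated mechanically. The Python sketch below does so under a loudly stated assumption: the diagram of SIX did not survive in this text, so the covering relation used here (⊥ below lf and lt, lf below f, lt below t, f and t below ⊤) is a reconstruction that agrees with the computations reported in the paper, not a quotation of Figure 1. All identifiers are hypothetical.

```python
from itertools import combinations

# ASSUMED covering relation for SIX: each element maps to the elements
# immediately above it. "bot" and "top" stand for the least and greatest element.
COVERS = {
    "bot": ["lf", "lt"],
    "lf": ["f"],
    "lt": ["t"],
    "f": ["top"],
    "t": ["top"],
    "top": [],
}
ELEMS = list(COVERS)


def leq(a, b):
    """a is below or equal to b in the assumed order."""
    return a == b or any(leq(c, b) for c in COVERS[a])


def sup(a, b):
    """Least upper bound of a and b."""
    upper = [u for u in ELEMS if leq(a, u) and leq(b, u)]
    return next(u for u in upper if all(leq(u, w) for w in upper))


def decompositions(mu):
    """Unordered pairs of elements other than mu whose Sup lies above mu."""
    parts = [e for e in ELEMS if e != mu]
    return [(a, b) for a, b in combinations(parts, 2) if leq(mu, sup(a, b))]
```

Under the assumed order, calling `decompositions("top")` yields exactly four pairs, one for each of the queries listed above.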

This kind of example led to the inference rule decomposition, introduced 
in [9]. Let Q be the query ← A_1 : μ_1, ..., A_m : μ_m, and suppose that A_i : ρ_1 
and A_i : ρ_2 are annotated atoms such that μ_i ⪯ Sup{ρ_1, ρ_2}. Then A_i : μ_i is 
said to decompose to {A_i : ρ_1, A_i : ρ_2}, and the decomposition of Q with respect 
to A_i : μ_i is 

← A_1 : μ_1, ..., A_{i−1} : μ_{i−1}, A_i : ρ_1, A_i : ρ_2, A_{i+1} : μ_{i+1}, ..., A_m : μ_m. 

Soundness of decomposition was stated in [9]; the proof is presented here. 

Theorem 1. Suppose Q is an annotated query. Let Q_d be a decomposition 
of Q, and let I be an interpretation. If I(Q) = true, then I(Q_d) = true. 

Proof: Let Q be the query ← A : μ where A : μ can be decomposed to 
A : ρ_1, A : ρ_2. Then the decomposed query Q_d is the query ← A : ρ_1, A : ρ_2. 
Since I(Q) = true, I(A : μ) = false. Hence μ ⋠ I(A). It suffices to show 
that I(Q_d) = true, i.e. ρ_1 ⋠ I(A) or ρ_2 ⋠ I(A). Because μ ⋠ I(A) and 
μ ⪯ Sup{ρ_1, ρ_2}, we have Sup{ρ_1, ρ_2} ⋠ I(A). To show ρ_1 ⋠ I(A) or ρ_2 ⋠ I(A), 
assume to the contrary that ρ_1 ⪯ I(A) and that ρ_2 ⪯ I(A). Then I(A) is an 
upper bound of ρ_1 and ρ_2 by definition, so Sup{ρ_1, ρ_2} ⪯ I(A), and we obtain 
the necessary contradiction. □ 

The next theorem extends to GAP’s the corresponding result in [9] for an- 
notated logic programming. 

Theorem 2. The inference system consisting of annotated resolution and de- 
composition is complete and is compatible with the linear restriction for GAP’s. 

□ 

Consider again the previous example. From the query ← p : ⊤, decomposition 
can be used to produce the query ← p : lf, p : t. This query may be solved with 
two annotated resolution steps. 

It should be clear, however, that the use of decomposition can be woefully 
inefficient. Even in the simple example above, there are four choices for 
decomposing the query ← p : ⊤. Suppose the initial query had been decomposed to 
the query ← p : f, p : lt. Then annotated resolution would still not be applicable. 
In general, without additional information to guide the search process, a lot of 
backtracking would be necessary with any implementation. 

One way to improve decomposition is to modify it to be a binary inference 
rule in which the selection of one component is based on the existence of an 
appropriate head literal in a clause. If μ and ρ are the annotations of the query 
and the selected head literal, respectively, another annotation γ may be guessed 
such that μ ⪯ Sup{γ, ρ}. The inference rule extended ℧-resolution, described 
below, exploits this observation. 

In [9], ℧-resolution was introduced, which is defined on a certain class of truth 
domains, the so-called ordinary lattices. There are truth domains for GAP’s that 
are not ordinary; SIX is an example. The rule defined here, tentatively called 
extended ℧-resolution, applies to all GAP’s, and is the subject of ongoing research. 
As we shall see, in a setting employing an ordinary lattice as the truth domain, 
extended ℧-resolution becomes ℧-resolution. More importantly, soundness and 
completeness arguments carry through in the general case. 

To develop extended ℧-resolution, consider P, a GAP over truth domain Δ, 
and suppose Q is the query ← A_1 : μ_1, ..., A_m : μ_m. Let C be the annotated clause 
A : ρ ← Body, where A = A_i. We would like to soundly infer from Q and C a 
new query 

← (A_1 : μ_1, ..., A_{i−1} : μ_{i−1}, A : γ, Body, A_{i+1} : μ_{i+1}, ..., A_m : μ_m) 

where γ is an appropriate guess for the annotation of the new goal. As the 
above discussion indicates, any guess in which μ_i ⪯ Sup{γ, ρ} has the desired 
properties. So by simply placing that requirement on γ in the inferred query, we 
have a sound inference. 

Of course, there may be many choices for γ, although certainly fewer in 
general than with decomposition. Furthermore, the desirable properties of 
decomposition, namely soundness, completeness, and compatibility with the linear 
restriction, can be shown to carry over. However, a further improvement is 
possible. 

Returning to the previous example, suppose we had chosen to infer a new 
query from the initial query ← p : ⊤ and the clause p : t ← q : t. The annotations ⊤ 
and t correspond to μ_i and ρ, respectively, in the inference discussed above. 
The choices left for γ are f and lf. Hence, we have reduced the number of 
potential backtracking steps from four, when using decomposition, to two, by 
these observations. 

Now, by Lemma 1, p : f ⊨ p : lf, so the query ← p : f represents a more 
difficult goal to satisfy than ← p : lf. Hence, the best choice for γ is lf since the 
resulting new query 

← p : lf, q : t 

is simpler than the alternative but achieves the same effect, namely solving the 
original query. 

This leads to the formal definition of extended ℧-resolution. Let 

M(μ, ρ) = {γ ∈ Δ | Sup{γ, ρ} ⪰ μ}, 

and suppose P is a GAP over an ordinary lattice Δ, Q is the query 
← A_1 : μ_1, ..., A_m : μ_m, and C is the annotated clause A : ρ ← Body, with A_i = A. Then 
the extended ℧-resolvent of Q and C with respect to A_i is the query 

← (A_1 : μ_1, ..., A_{i−1} : μ_{i−1}, A : γ, Body, A_{i+1} : μ_{i+1}, ..., A_m : μ_m) 

where γ is ⪯-minimal in M(μ_i, ρ). An extended ℧-deduction of a query from a 
GAP is an extended ℧-proof if 

← A_1 : ⊥, A_2 : ⊥, ..., A_n : ⊥ 

is the last clause in the ℧-deduction. 

Soundness, completeness, and compatibility with the linear restriction still 
hold; the proof for ℧-resolution in [9] can be adapted to extended ℧-resolution. 






Theorem 3. Extended ℧-resolution is sound and complete with respect to GAP 
in general, and is compatible with the linear restriction. □ 

4 HKB-Resolution and ℧-Resolution 

In Section 3 we alluded to the fact that the advantages of HKB-resolution arise 
because [0, 1]^𝒩 is an ordinary lattice. The next section contains an explanation 
of these advantages, and the next theorem proves that [0, 1]^𝒩 is ordinary. First 
recall the definition of ordinary. Let Δ be any lattice; if μ, ρ ∈ Δ, let M(μ, ρ) = 
{γ ∈ Δ | Sup{γ, ρ} ⪰ μ}, and let ℧(μ, ρ) = Inf M(μ, ρ). Then Δ is said to be 
an ordinary lattice if ℧(μ, ρ) ∈ M(μ, ρ). To see why [0, 1]^𝒩 is ordinary, given the 
functions μ, ρ ∈ [0, 1]^𝒩, define μ * ρ as follows: For each x ∈ 𝒩, 

(μ * ρ)(x) = 0 if μ(x) ≤ ρ(x), 

(μ * ρ)(x) = μ(x) otherwise. 

Lemma 2. If μ, ρ ∈ [0, 1]^𝒩, then μ * ρ = ℧(μ, ρ). 

Proof: It suffices to show that 

1. μ * ρ ∈ M(μ, ρ), and 

2. μ * ρ ⪯ γ for every γ ∈ M(μ, ρ). 

To prove (1), let x ∈ 𝒩. If ρ(x) ≥ μ(x), then (μ * ρ)(x) = 0, so (Sup{μ * ρ, ρ})(x) = 
ρ(x) ≥ μ(x). If ρ(x) < μ(x), then (μ * ρ)(x) = μ(x). In either case, (Sup{μ * ρ, ρ})(x) ≥ 
μ(x), so Sup{μ * ρ, ρ} ⪰ μ; i.e., μ * ρ ∈ M(μ, ρ). 

To prove (2), consider any γ in M(μ, ρ). To show that μ * ρ ⪯ γ, let 
x ∈ 𝒩. If ρ(x) ≥ μ(x), then (μ * ρ)(x) = 0 ≤ γ(x). Otherwise, since γ ∈ 
M(μ, ρ), (Sup{γ, ρ})(x) ≥ μ(x). Since ρ(x) < μ(x), γ(x) ≥ μ(x) = (μ * ρ)(x). 
In either case, γ(x) ≥ (μ * ρ)(x), so μ * ρ ⪯ γ. □ 

Theorem 4 now follows easily from the lemma since μ * ρ ∈ [0, 1]^𝒩 whenever 
μ, ρ ∈ [0, 1]^𝒩. 

Theorem 4. The lattice [0, 1]^𝒩 is ordinary. □ 

It is easy to see that the theorem applies to the l-representable functions in 
[0, 1]^𝒩: If (v, s) and (u, t) are l-representable, then (v, s) * (u, t) = (v, s) if v > u, 
and (v, s) * (u, t) = (v, s − t) if v ≤ u, so the set of l-representable functions is 
closed under the operator *. Moreover, the HKB-resolvent of A_1 : (v, s) and A_1 : 
(u, t) is A_1 : (v, s) * (u, t). In particular, this proves the next theorem. 
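
The operator * can be checked numerically on functions of finite support. The Python sketch below is an illustration only: functions in [0, 1]^𝒩 are represented as dictionaries from time points to values (missing keys mean 0), and the helper names `star`, `sup`, `leq` and `from_pair` are hypothetical.

```python
def from_pair(v, times):
    """The function written (v, times): value v on times, 0 elsewhere."""
    return {x: v for x in times}


def star(mu, rho):
    """Pointwise: 0 where mu(x) <= rho(x), and mu(x) otherwise."""
    return {x: v for x, v in mu.items() if v > rho.get(x, 0.0)}


def sup(f, g):
    """Pointwise maximum of two finitely supported functions."""
    return {x: max(f.get(x, 0.0), g.get(x, 0.0)) for x in set(f) | set(g)}


def leq(f, g):
    """Pointwise order; points outside f's support are 0 and need no check."""
    return all(v <= g.get(x, 0.0) for x, v in f.items())
```

For the pairs (0.5, {1, 2, 3}) and (0.7, {2, 3}) the result is (0.5, {1}), which is the set-difference form stated above, and joining it with the second function again dominates the first.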

Theorem 5. HKB-resolution is precisely ℧-resolution restricted to the truth 
domain [0, 1]^𝒩. □ 

The next corollary is a restatement of the completeness theorem from [11], 
but it follows directly from Theorems 3, 4, and 5. 



Corollary. HKB-resolution is sound and complete for hybrid knowledge bases 
and is compatible with the linear restriction. □ 






5 Computational Considerations 

There are many ordinary lattices other than [0, 1]^𝒩; examples include all finite 
distributive lattices [9], all powerset lattices, and many of the temporal lattices 
mentioned by Kifer and Subrahmanian [7], among others. The lattice SIX 
(Figure 1) is not ordinary. Consider, for instance, t and lt: 

℧(t, lt) = Inf M(t, lt) = Inf{t, ⊤, f, lf} = ⊥ ∉ M(t, lt). 

There are pairs (μ, ρ) within SIX, and in many lattices that are not ordinary, that 
satisfy ℧(μ, ρ) ∈ M(μ, ρ). For such pairs, extended ℧-resolution is particularly 
attractive since, as we have seen for HKB’s, ℧(μ, ρ) represents the “best” choice 
for the γ in the definition of extended ℧-resolution. For pairs that do not satisfy 
℧(μ, ρ) ∈ M(μ, ρ), extended ℧-resolution is still applicable and still reduces the 
number of choices compared with decomposition. 
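
These computations can be reproduced with a short Python sketch. As before, the order on SIX is an assumption reconstructed from the values reported in the text (the figure itself is not available here), encoded as a table of down-sets; the names `BELOW`, `m_set`, `mho`, `minimal` and `is_ordinary` are hypothetical.

```python
# ASSUMED order on SIX: BELOW[x] is the set of all elements below or equal to x.
BELOW = {
    "bot": {"bot"},
    "lf": {"bot", "lf"},
    "lt": {"bot", "lt"},
    "f": {"bot", "lf", "f"},
    "t": {"bot", "lt", "t"},
    "top": {"bot", "lf", "lt", "f", "t", "top"},
}


def leq(a, b):
    return a in BELOW[b]


def sup(a, b):
    """Least upper bound of a and b."""
    upper = [u for u in BELOW if leq(a, u) and leq(b, u)]
    return next(u for u in upper if all(leq(u, w) for w in upper))


def inf_of(elems):
    """Greatest lower bound of a non-empty collection of elements."""
    lower = [l for l in BELOW if all(leq(l, e) for e in elems)]
    return next(l for l in lower if all(leq(k, l) for k in lower))


def m_set(mu, rho):
    """M(mu, rho): all gamma whose Sup with rho lies above mu."""
    return {g for g in BELOW if leq(mu, sup(g, rho))}


def mho(mu, rho):
    return inf_of(m_set(mu, rho))


def minimal(elems):
    """Minimal elements of a set, i.e. the candidates for gamma."""
    return {g for g in elems if not any(h != g and leq(h, g) for h in elems)}


def is_ordinary():
    return all(mho(m, r) in m_set(m, r) for m in BELOW for r in BELOW)
```

Under the assumed order, `m_set("t", "lt")` is {t, ⊤, f, lf} with infimum ⊥, which lies outside the set, while `minimal(m_set("top", "t"))` singles out lf, the choice of γ made in the example of Section 3.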

In addition to the aforementioned advantages, extended ℧-resolution 
automatically incorporates a form of search space pruning. To see how, suppose we 
are given a query ← A : μ and a clause A : ρ ← Body, where μ ⋠ ρ. Application of 
extended ℧-resolution is useful only if μ ⋠ γ, where γ is the annotation selected 
for the ℧-resolvent. The reason is that, if μ ⪯ γ, then the resulting query is 
subsumed immediately by the initial query. 

Now, if γ is ⪯-minimal in M(μ, ρ), μ ≺ γ can never occur, and the only 
situation that warrants consideration is when μ = γ. More generally, we would 
like to avoid deductions that are “circular” with respect to annotations, as in 
the following situation. 



Q_0: ← A : μ_0, ... 

Q_1: ← A : μ_1, ... 

⋮ 

Q_n: ← A : μ_0, ... 

Here, Q_0 is some initial query, and each query Q_i in the deduction is the result of 
applying extended ℧-resolution to the atom A : μ_{i−1}. Note that the annotation 
associated with A at step Q_n is the same as that of the initial query. In general, 
loop detection techniques such as tabling (see, for instance, [3]) are required 
to prevent circular deductions from occurring. However, for ordinary lattices, 
extended ℧-resolution is ℧-resolution, and circular deductions with respect to 
annotations are avoided by not allowing μ = ℧(μ, ρ). The following example 
assumes the lattice SIX, which, although not ordinary, illustrates the point. Let 

P = { p : f ← q : f, p : t ← q : t, q : t ←, q : f ← }, 

with the clauses labelled C_1, C_2, C_3, C_4 from left to right. 

Observe that P ⊨ p : lt. However, the query ← p : lt is automatically prevented 
from ℧-resolving with C_1, since ℧(lt, f) = lt; i.e., the resolvent is immediately 
subsumed by the original query. Thus, C_2 and C_3 are acceptable candidates for 
extended ℧-resolution with the query, but C_1 is not. 






In this example, C_1 is not required for solving the query. It could be used 
if subsuming queries were allowed (see the deduction shown in Figure 2), but 
whether the query is solvable relies completely on whether it can be solved from 
P − {C_1}. Avoiding the use of C_1 may represent considerable savings since the 
body of C_1 could be arbitrarily large and/or could lead to dead-end deductions. 
Indeed, proof space reduction was a primary motivation for the introduction of 
℧-resolution in [9]. Below we consider other deduction properties of extended 
℧-resolution for both ordinary and non-ordinary lattices. The important result 
is that, for ordinary lattices, the pruning of the search space occurs independently of 
the order in which clauses are selected for extended ℧-resolution. 

<— p : It original query 

<— p : It, g : f 0-resolution with Ci 

<— g:t,g:f 0-resolution with (72 

<— g : f 0-resolution with (7s 

□ 0-resolution with Ci 

Fig. 2 . A deduction allowing for subsuming queries 

5.1 Properties of ℧-Proofs

Perhaps the most important computational property of ℧-resolution, proved
in [9], is that, in comparison with the inference system AR proposed in [7], for
each proof that can be obtained through AR, there is a ℧-resolution proof that
is no longer. Moreover, AR is not compatible with the linear restriction. The
same result is valid for extended ℧-resolution, which is introduced here.

Theorem 6. Let $P$ be a GAP and $Q$ a query. Suppose $\mathcal{P}_{AR}$ is a proof of $Q$
from $P$ with the inference system AR. Then there is a proof $\mathcal{P}_{\mho}$ of $Q$ from $P$
with extended ℧-resolution that is no longer than $\mathcal{P}_{AR}$. □

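There is one simple sufficient (but not necessary) condition for detecting
$\mu = \mho(\mu,\rho)$: every element less than $\mu$ is also less than $\rho$. More precisely, the downset
of $x \in \Delta$ is $\downarrow x = \{y \in \Delta \mid y \preceq x\}$. Then $\mu$ is said to be prime relative to $\rho$ if $\mu$
and $\rho$ are not comparable and $(\downarrow\mu - \{\mu\}) \subseteq\ \downarrow\rho$.

Lemma 3. Suppose $\Delta$ is ordinary and $\mu$ is prime relative to $\rho$. Then $\mu = \mho(\mu,\rho)$.

Proof: Since $\mu \in \mathcal{M}(\mu,\rho)$ by definition, it suffices to show that $\mu \preceq \gamma$ for each
$\gamma \in \mathcal{M}(\mu,\rho)$. Suppose not. Then for some $\gamma \in \mathcal{M}(\mu,\rho)$, either $\gamma \prec \mu$ or $\mu$ and $\gamma$
are incomparable.

If $\gamma \prec \mu$, then $\gamma \preceq \rho$, since $\mu$ is prime relative to $\rho$. It follows that
$\mathrm{Sup}\{\gamma,\rho\} = \rho$, which is not comparable to $\mu$. But this contradicts the fact
that $\gamma \in \mathcal{M}(\mu,\rho)$.

Suppose now that $\mu$ and $\gamma$ are incomparable; let $\delta = \mathrm{Inf}\{\gamma,\mu\}$. Since
$\gamma,\mu \in \mathcal{M}(\mu,\rho)$, $\mho(\mu,\rho) \preceq \delta$. Hence, since $\Delta$ is ordinary, $\delta \in \mathcal{M}(\mu,\rho)$, so
$\mu \preceq \mathrm{Sup}\{\rho,\delta\}$. On the other hand, since $\delta \prec \mu$ and $\mu$ is prime relative to
$\rho$, $\delta \preceq \rho$. But then $\mathrm{Sup}\{\delta,\rho\} = \rho$, which is not comparable to $\mu$. Contradiction.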

□ 






The condition of relative primeness provides a convenient test for avoiding
useless applications of ℧-resolution. The converse of the lemma does not hold,
however. A counterexample is the lattice shown in Figure 3. The element $\mu$ is
not prime relative to $\rho$ and yet $\mho(\mu,\rho) = \mu$. Observe on the other hand that $\rho$
is prime relative to $\mu$ and Lemma 3 applies. Finding a necessary and sufficient
condition for characterizing when $\mu = \mho(\mu,\rho)$ remains an open problem. Such a
condition would be useful in understanding the extent of the pruning that occurs
with ℧-resolution.

Fig. 3. The lattice FIVE

Another issue in the analysis of proof space: How much is the pruning dependent
on the order of clause selection? Consider the query $\leftarrow A:\mu$ and these
clauses in a GAP

$$C_1:\ A:\beta_1 \leftarrow Body_1,\quad C_2:\ A:\beta_2 \leftarrow Body_2,\quad \ldots,\quad C_n:\ A:\beta_n \leftarrow Body_n .$$

Consider $C_1$, and suppose $\mu = \mho(\mu,\beta_1)$. An application of ℧-resolution to $C_1$
and the query $\leftarrow A:\mu$ produces a subsumed query, so this step can and should be
avoided. The question is, will choosing some other clause first change the status
of $C_1$ so that it can usefully participate in the deduction? The next theorem will
help to answer this question.



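Theorem 7 (Order Independence). Suppose $\Delta$ is ordinary, and suppose
$\mu, \beta_1, \beta_2 \in \Delta$. Then $\mathcal{M}(\mho(\mu,\beta_1),\beta_2) = \mathcal{M}(\mho(\mu,\beta_2),\beta_1)$.

Proof: Let $\alpha_1 = \mho(\mu,\beta_1)$ and $\alpha_2 = \mho(\mu,\beta_2)$; the following equivalences prove
the theorem.

$$\begin{array}{rcl}
x \in \mathcal{M}(\alpha_1,\beta_2) & \text{iff} & \alpha_1 \preceq \mathrm{Sup}\{x,\beta_2\}\\
 & \text{iff} & \mathrm{Sup}\{\alpha_1,\beta_1\} \preceq \mathrm{Sup}\{x,\beta_1,\beta_2\}\\
 & \text{iff} & \mu \preceq \mathrm{Sup}\{x,\beta_1,\beta_2\}\\
 & \text{iff} & \mathrm{Sup}\{x,\beta_1\} \in \mathcal{M}(\mu,\beta_2)\\
 & \text{iff} & \alpha_2 \preceq \mathrm{Sup}\{x,\beta_1\}\\
 & \text{iff} & x \in \mathcal{M}(\alpha_2,\beta_1)
\end{array}$$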



□ 



A simple corollary of the theorem is that the result of applying $\mho$ to a given $\mu$
and two elements, $\beta_1$ and $\beta_2$, is order independent.

Corollary. Suppose $\Delta$ is ordinary. Then for any triple $\mu$, $\beta_1$ and $\beta_2$,
$\mho(\mho(\mu,\beta_1),\beta_2) = \mho(\mho(\mu,\beta_2),\beta_1)$. □
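Theorem 7 and its corollary can be checked exhaustively on a small lattice. The following Python sketch is an illustration only: it assumes the lattice of divisors of 60 ordered by divisibility (so Sup is lcm and Inf is gcd) and assumes $\mho(\mu,\rho) = \mathrm{Inf}\,\mathcal{M}(\mu,\rho)$ with $\mathcal{M}$ as in the proof above. The choice of lattice and the names `DIVS`, `M` and `mho` are hypothetical.

```python
from functools import reduce
from math import gcd

N = 60  # assumed example: divisors of 60 under divisibility
DIVS = [d for d in range(1, N + 1) if N % d == 0]

def lcm(a, b):
    return a * b // gcd(a, b)

def M(mu, rho):
    """All gamma with mu <= Sup{gamma, rho}, i.e. mu divides lcm(gamma, rho)."""
    return [g for g in DIVS if lcm(g, rho) % mu == 0]

def mho(mu, rho):
    """Inf of M(mu, rho); Inf is gcd in this lattice."""
    return reduce(gcd, M(mu, rho))

# The lattice is ordinary if mho(mu, rho) always lies in M(mu, rho).
ordinary = all(mho(m, r) in M(m, r) for m in DIVS for r in DIVS)

# Theorem 7: M(mho(mu, b1), b2) = M(mho(mu, b2), b1).
theorem7 = all(M(mho(m, b1), b2) == M(mho(m, b2), b1)
               for m in DIVS for b1 in DIVS for b2 in DIVS)

# Corollary: applying mho with b1 and b2 gives the same result in either order.
corollary = all(mho(mho(m, b1), b2) == mho(mho(m, b2), b1)
                for m in DIVS for b1 in DIVS for b2 in DIVS)

print(ordinary, theorem7, corollary)
```

A brute-force check of this kind is no substitute for the proof, but it is a cheap sanity test when experimenting with a candidate truth domain.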






To answer the question raised before the theorem, if $C_1$ is not useful for the
atom $A:\mu$, then it must be the case that $\mho(\mu,\beta_1) = \mu$. Suppose that $C_2$ instead
of $C_1$ is chosen first as the clause on which to resolve, and that $\mho(\mu,\beta_2) = \rho$.
Then the corollary tells us that $\mho(\rho,\beta_1)$ must be $\rho$ since

$$\mho(\rho,\beta_1) = \mho(\mho(\mu,\beta_2),\beta_1) = \mho(\mho(\mu,\beta_1),\beta_2) = \mho(\mu,\beta_2) = \rho .$$

This implies that $C_1$ will be pruned from the search whether it is chosen first or
second. The argument can be generalized to the entire set of clauses $C_1, \ldots, C_n$ in
the HKB. It follows that during a ℧-deduction, a clause that is “unnecessary” in
a proof will be pruned independently of the order in which clauses are selected.

Theorem 8 (Clause Selection Independence). Suppose $A:\mu$ is an atom
contained in a query $Q$ and $C$ is an annotated clause whose head, $A:\beta$, satisfies
$\mho(\mu,\beta) = \mu$. Then $C$ will not participate in any ℧-proof of the atom $A:\mu$ in $Q$.

□ 

Note that these results pertain to program clauses identifiable as unnecessary 
for a given query. A situation that seems similar but that is really quite different 
is the following. Given program clauses $C_1, \ldots, C_n$ and query $Q$, several distinct
subsets of the clauses may be minimally sufficient to solve the query. For example,
suppose $\{C_1, \ldots, C_{n-2}\}$ and $\{C_3, \ldots, C_n\}$ are both sufficient to solve $Q$ but
neither has a proper subset that can. Initially, $C_1$ is relevant to a solution of
the query, and it will remain so as the clauses $C_3, \ldots, C_{n-2}$ are selected for
℧-resolution. But if $C_{n-1}$ is then selected, $C_1$ may then be irrelevant to solving the
query at hand.

Also note that this is unrelated to the issue of how clauses can be intelligently 
selected to obtain shorter proofs. Consider the following GAP over the lattice 
SIX. 



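$$\begin{array}{ll}
C_1: & p:\mathbf{lt} \leftarrow Body_1\\
C_2: & p:\mathbf{lf} \leftarrow Body_2\\
C_3: & p:\mathbf{f} \leftarrow Body_3
\end{array}$$

Both $C_2$ and $C_3$ resolve away the goal in the query $\leftarrow p:\mathbf{t}$. However, the resulting
queries, $Body_2$ and $Body_3$, may admit very different proofs; indeed, one may
not admit a proof at all.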

Remark. This kind of analysis of proof and search space properties of infer- 
ence engines is rarely available. The result here is strong validation that the 
deductive component of hybrid knowledge bases is the “right” rule of inference. 
Theorems 4 and 5, on the other hand, bring to focus the narrowness of HKB- 
resolution. The computational advantage of HKB-resolution is not limited to the 
domain [0, 1]-^ — it is applicable to the much richer class of ordinary lattices. The 
generalization of HKB-resolution, ℧-resolution, therefore provides an attractive
basis for the flexible implementation of HKB systems without sacrificing effi- 
ciency. 






5.2 KOMET 

The KOMET system [2,1], which employs the annotated resolution system of 
Kifer and Subrahmanian [7], has been applied to the mediation of web searches. 
An example of a simple mediator written in KOMET, including a detailed expla- 
nation, can be found at the web site: http://calmet-pc.ira.uka.de/komet . The 
truth domain is given by 

WWW = POWERSET(AltaVista,Excite,Yahoo,Lycos), 

which defines the powerset lattice of a four element set. 

It is interesting to note that KOMET does not restrict its truth domain to
[0, 1]-^, not surprising in view of the remark at the end of the last section. More 
interesting to note is that in all of the examples on that web page, the truth 
domains are ordinary. 
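That a powerset lattice such as WWW is ordinary can be confirmed by enumeration. The Python sketch below is an illustration under the assumption that the lattice order is set inclusion (Sup is union, Inf is intersection) and that $\mho(\mu,\rho) = \mathrm{Inf}\,\mathcal{M}(\mu,\rho)$; the identifiers `ENGINES`, `WWW`, `M` and `mho` are hypothetical and unrelated to the KOMET implementation.

```python
from itertools import combinations

ENGINES = ("AltaVista", "Excite", "Yahoo", "Lycos")
# All 16 subsets of the four search engines, ordered by inclusion.
WWW = [frozenset(c)
       for r in range(len(ENGINES) + 1)
       for c in combinations(ENGINES, r)]

def M(mu, rho):
    """All gamma with mu <= Sup{gamma, rho}, i.e. mu is a subset of gamma | rho."""
    return [g for g in WWW if mu <= (g | rho)]

def mho(mu, rho):
    """Inf of M(mu, rho); Inf is intersection in a powerset lattice."""
    return frozenset.intersection(*M(mu, rho))

# Ordinary: the infimum of M(mu, rho) is itself a member of M(mu, rho).
is_ordinary = all(mho(m, r) in M(m, r) for m in WWW for r in WWW)

# In this lattice the operator coincides with set difference.
is_difference = all(mho(m, r) == m - r for m in WWW for r in WWW)

print(is_ordinary, is_difference)
```

Read this way, resolving a query annotated with a set of engines against a clause head removes exactly the engines the head already accounts for.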



References 

1. Calmet, J., Jekutsch, S., Kullmann, P., and Schü, J., A system for the integration of
heterogeneous information sources. Proceedings of the Symposium on Methodologies 
for Intelligent Systems, 1997. 155, 166 

2. Calmet, J. and Kullmann, P., Meta web search with KOMET, Proceedings of the 
IJCAI-99 Workshop on Intelligent Information Integration, Stockholm, July, 1999. 
155, 166 

3. Chen, W. and Warren, D. S., Tabled Evaluation With Delaying for General Logic
Programs. J.ACM, 43(1): 20-74, 1996. 162 

4. Fitting, M., Bilattices and the Semantics of Logic Programming, The Journal of 
Logic Programming, Elsevier Science Publishing Co, Inc., 11:91-116, 1991. 155 

5. Hahnle, R. and Escalada-Imaz, G., Deduction in many-valued logics: a survey, 
Mathware & Soft Computing, IV(2), 69-97, 1997. 155 

6. Jaffar, J. and Lassez, J.L., Constraint Logic Programming, Proceedings of the ACM 
Principles of Programming Languages, 111-119, 1987. 155 

7. Kifer, M., and Subrahmanian, V.S., Theory of generalized annotated logic pro- 
gramming and its applications, the J. of Logic Programming 12, 335-367, 1992. 
155, 155, 155, 157, 162, 163, 166 

8. Leach, S.M., and Lu, J.J., Query Processing in Annotated Logic Programming: 
Theory and Implementation, Journal of Intelligent Information Systems, 6(1):33- 
58, 1996. 155 

9. Leach, S.M., Lu, J.J., Murray, N.V., and Rosenthal, E., ℧-resolution: an inference
for regular multiple-valued logics. Proceedings of the 6th European Workshop on 
Logics in AI, Springer, 1998. 155, 159, 159, 159, 159, 160, 162, 163, 163 

10. Lloyd, J.W., Foundations of Logic Programming, 2nd ed.. Springer, 1988. 157 

11. Lu, J.J., Nerode, A., and Subrahmanian, V.S., Hybrid Knowledge Bases, IEEE 
Transactions on Knowledge and Data Engineering, 8(5):773-785, 1996. 155, 155, 
155, 156, 156, 157, 158, 161 

12. Lu, J.J., Murray, N.V., and Rosenthal, E., A Framework for Automated Reasoning 
in Multiple-Valued Logics, J. of Automated Reasoning 21:39-67, 1998. 155

13. Subrahmanian, V.S., et. al., HERMES: Heterogeneous Reasoning and Mediator 
System, University of Maryland Technical Report. Available at: 

http://www.cs.umd.edu//projects/hermes/overview/paper/index.html 155






14. Wiederhold, G., Mediators in the Architecture of Future Information Systems, 
IEEE Computer, 38-49, 1992. 155, 155 

15. Wiederhold, G., Intelligent Integration of Information, Proceedings of the ACM 
SIGMOD Conference on Management of Data, 434-437, 1993. 155, 155

16. Wiederhold, G., Jajodia, S., and Litwin, W., Dealing with granularity of time in 
temporal databases. Proceedings of the Nordic Conference on Advanced Informa- 
tion Systems Engineering (R. Anderson et al. eds.), Springer, 124-140, 1991. 155 

17. Wiederhold, G., Jajodia, S., and Litwin, W., Integrating temporal data in a
heterogeneous environment. Temporal Databases, Benjamin Cummings, 1993. 155



Hoare Logic for Mutual Recursion and Local 

Variables 



David von Oheimb* 



Technische Universität München
D-80290 München, Germany.

www.in.tum.de/~oheimb/



Abstract. We present a (the first?) sound and relatively complete Hoare 
logic for a simple imperative programming language including mutually 
recursive procedures with call-by-value parameters as well as global and 
local variables. For such a language we formalize an operational and an 
axiomatic semantics of partial correctness and prove their equivalence. 
Global and local variables, including parameters, are handled in a rather 
straightforward way allowing for both dynamic and simple static scoping. 
For the completeness proof we employ the powerful MGF (Most General 
Formula) approach, introducing and comparing three variants for dealing 
with complications arising from mutual recursion. 

All this work is done using the theorem prover Isabelle/HOL, which en- 
sures a rigorous treatment of the subject and thus reliable results. The 
paper gives some new insights in the nature of Hoare logic, in particular 
motivates a stronger rule of consequence and a new flexible Call rule.

Keywords: axiomatic semantics, Hoare logic, mutual recursion, sound- 
ness, relative completeness, local variables, call-by-value parameters, 
Isabelle/HOL. 



1 Introduction 

Designing a good Hoare logic for imperative languages with mutually recursive 
procedures and local variables still is an active area of research. By ‘good’ we 
mean a provably sound and (relatively) complete calculus that is as simple as 
possible and thus easy to apply. There are several complications and pitfalls 
concerning the status of auxiliary variables, initialization of variables, scoping, 
parameter passing, and mutual recursion. As we will explain in the sequel, the 
work presented here provides theoretically interesting and practically useful so- 
lutions to these problems, and thus is good in the above sense. 

Classical verification systems dealing with these subjects - see [1] for an 
overview - typically neglect mutual recursion and have turned out to be un- 
sound, as mentioned e.g. by [4] and [5], or incomplete, or at least require several 
auxiliary rules with awkward syntactic side-conditions. Recent investigations 

* research funded by the DFG Project Bali, http://isabelle.in.tum.de/Bali/. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 168—180, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






tend to be much more precise, e.g. on the role of auxiliary variables, and even 
employ mechanical theorem provers to reliably prove soundness and complete- 
ness results. Here we emphasize the work of Kleymann^[10],[5] who suggests a 
Hoare logic of total correctness and proves it sound and relatively complete with 
the mechanical theorem prover LEGO. 

The work described in the present paper has been conducted in the context of 
Project Bali formalizing the semantics of Java and proving key properties like 
type soundness [7] formally within the theorem proving system Isabelle/HOL. 
Introducing an axiomatic semantics for a large subset of Java, we felt that there 
were several issues like mutual recursion and parameter passing where we could 
not resort to already established techniques. It turned out to be very practical 
and fruitful to perform our investigations in the reduced setting of a simple 
imperative programming language. In this respect we benefit from the pioneering 
work of Nipkow[6] that deals with the basic language (without procedures and 
local variables) within Isabelle/HOL. 

One could argue that mutual recursion can be reduced to the already estab- 
lished results on single recursion (e.g. of Kleymann) by program transformation. 
But this would require non-trivial syntactic manipulations, which would be dif- 
ficult to handle in a precise proof of soundness and unsuitable for practical 
program verification. Concerning local variables, the only fully formal treatment
we know of, given by Kleymann, is somewhat involved, which discourages
transferring it to procedure parameters. We are not aware of any previous
work tackling even one of mutual recursion and procedure parameters whose
soundness and (relative) completeness has been mechanically verified.

Just a few words on Isabelle/HOL: This is the instantiation of the generic in- 
teractive theorem prover Isabelle[8] with Church’s version of Higher-Order Logic. 
The appearance of formulas in Isabelle/HOL is standard (e.g. ‘$\Rightarrow$’ is the infix
implication symbol associating to the right) except that logical equivalence is
expressed with the equality symbol. Predicates are functions with Boolean result,
and function application is written in curried style, e.g. $f\ x$. Logical constants
are declared by giving their name and type, such as $c :: \tau$. Basic definitions
are written c = t. Types follow the syntax of ML; type abbreviations are intro- 
duced simply as equations. A free datatype is defined by listing its constructors 
together with their argument types, separated by ‘|’. Isabelle offers powerful veri- 
fication tools like natural deduction involving several variants of search, tableaux 
reasoning, general rewriting, and combinations thereof. 

We deliberately let the style of presentation of this paper be influenced by 
the fully formal treatment caused by using Isabelle/HOL, which should give an 
impression of its rigor. On the other hand, we abstract from technical details as 
much as possible in order to present our results in a generic way. 



^ formerly Schreiber. 






2 The IMPp Programming Language 

Winskel [11] has introduced a simple imperative programming language for edu-
cational purposes called IMP. We enriched it with procedures and local variables, 
calling the result IMPp. The syntax of its statements (“commands”) is 

com = SKIP | com; com | vname := aexp | LOCAL loc := aexp IN com
    | IF bexp THEN com ELSE com | WHILE bexp DO com
    | Call pname | vname := CALL pname(aexp)

where the meanings of most of these constructs (Call being just an auxiliary
one) are what one expects. The types $aexp = state \to val$ and $bexp = state \to bool$
represent arithmetic and Boolean expressions, which we do not further specify
since we need only their (black-box) semantics. The type state has two
components, namely the function spaces $globs = glb \to val$ and $locals = loc \to val$
representing the stores for global and local variables. The two kinds of variable
names are combined into a free datatype $vname = Glb\ glb \mid Loc\ loc$ where Glb
and Loc act as tags to distinguish them. The types glb and loc as well as the
type of values val are left unspecified. The type of procedure names pname is
also arbitrary, but is required to be finite,^ as motivated in §5.

We model the procedure declarations of a given program by a function 
$body :: pname \to com$ mapping procedure names to the corresponding procedure
bodies. Our meta-theoretic investigations do not require body to be specified
further. For simplicity, each procedure has exactly one parameter, which
we model by a generic local variable $Arg :: loc$, and a result variable $Res :: loc$
(where $Res \neq Arg$) whose value is returned on procedure exit. These are merely
syntactic restrictions avoiding immaterial but cumbersome details like explicit 
parameter declarations and return statements. 



2.1 Operational Semantics 



We define the semantics of IMPp straightforwardly by an evaluation-style
operational (“natural”) semantics. The evaluation (“execution”) of a statement $c$ is
described as a relation $evalc :: (com \times state \times state)\ set$ between an initial state $\sigma$
and a final state $\sigma'$, written $\langle c,\sigma\rangle \longrightarrow \sigma'$. For lack of space and since the other
inductive rules defining evalc are standard, we give only the relevant ones here:



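$$\textsc{Local}\quad\frac{\langle c,\ \sigma_0[a\,\sigma_0/X]\rangle \longrightarrow \sigma_1}{\langle \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c,\ \sigma_0\rangle \longrightarrow \sigma_1[\sigma_0\langle X\rangle/X]}$$

$$\textsc{CALL}\quad\frac{\langle \mathtt{Call}\ pn,\ (setlocs\ \sigma_0\ newlocs)[a\,\sigma_0/Arg]\rangle \longrightarrow \sigma_1}{\langle X{:=}\mathtt{CALL}\ pn(a),\ \sigma_0\rangle \longrightarrow (setlocs\ \sigma_1\ (getlocs\ \sigma_0))[X{:=}\sigma_1\langle Res\rangle]}$$

$$\textsc{Call}\quad\frac{\langle body\ pn,\ \sigma_0\rangle \longrightarrow \sigma_1}{\langle \mathtt{Call}\ pn,\ \sigma_0\rangle \longrightarrow \sigma_1}$$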



^ This is not a real restriction but a handy trick that avoids explicit well-formedness 
constraints implying that in any program there is only a finite number of procedures. 






Note that local variables are initialized immediately when being created. The 
usual notion of procedure call is split into two parts, which will be very useful 
for the axiomatic semantics. The CALL statement replaces the local variables 
of the caller by the actual parameter of the called procedure as the only (by 
virtue of newlocs) local variable - thus implementing trivial static scoping - 
and restores them (except for assigning the result variable) after return. The 
Call statement is responsible for unfolding the procedure body only, thus im- 
plementing recursion. If it is invoked directly rather than via CALL, it implements 
dynamic scoping. 
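The three rules can be read as an interpreter. The following Python sketch is a minimal illustration and not the Isabelle/HOL formalization: it assumes that a state is a pair of dictionaries (globals, locals), that expressions are Python functions on states, and that commands are tagged tuples. Since dictionaries are partial, the LOCAL case removes a variable that did not exist before, which is an assumption of this sketch. All names except `newlocs`, `setlocs`, `getlocs`, `Arg` and `Res` are hypothetical.

```python
def newlocs():
    return {}  # empty local store

def setlocs(s, locs):
    return (s[0], dict(locs))

def getlocs(s):
    return s[1]

def update(s, x, v):
    """s[x := v] for x = ("Glb", name) or ("Loc", name)."""
    globs, locs = dict(s[0]), dict(s[1])
    (globs if x[0] == "Glb" else locs)[x[1]] = v
    return (globs, locs)

def evalc(c, s, body):
    """Big-step evaluation; body maps procedure names to commands."""
    tag = c[0]
    if tag == "SKIP":
        return s
    if tag == "seq":
        return evalc(c[2], evalc(c[1], s, body), body)
    if tag == "assign":
        return update(s, c[1], c[2](s))
    if tag == "IF":
        return evalc(c[2] if c[1](s) else c[3], s, body)
    if tag == "WHILE":
        while c[1](s):
            s = evalc(c[2], s, body)
        return s
    if tag == "LOCAL":  # rule Local: initialize X, run c, restore old X
        _, x, a, c1 = c
        s1 = evalc(c1, update(s, ("Loc", x), a(s)), body)
        locs = dict(getlocs(s1))
        if x in getlocs(s):
            locs[x] = getlocs(s)[x]
        else:
            locs.pop(x, None)
        return setlocs(s1, locs)
    if tag == "CALL":   # rule CALL: fresh locals with Arg, restore afterwards
        _, x, pn, a = c
        entry = update(setlocs(s, newlocs()), ("Loc", "Arg"), a(s))
        s1 = evalc(("Call", pn), entry, body)
        return update(setlocs(s1, getlocs(s)), x, getlocs(s1)["Res"])
    if tag == "Call":   # rule Call: unfold the procedure body
        return evalc(body[c[1]], s, body)
    raise ValueError(f"unknown command {tag!r}")

def arg(s):
    return getlocs(s)["Arg"]

def parity(other, base):
    """Body of a hypothetical even/odd procedure pair (mutual recursion)."""
    return ("IF", lambda s: arg(s) == 0,
            ("assign", ("Loc", "Res"), lambda s: base),
            ("CALL", ("Loc", "Res"), other, lambda s: arg(s) - 1))

BODY = {"even": parity("odd", 1), "odd": parity("even", 0)}
main = ("CALL", ("Glb", "out"), "even", lambda s: s[0]["n"])
final = evalc(main, ({"n": 10}, {"tmp": 7}), BODY)
print(final)
```

In the run above the caller's local `tmp` is invisible inside the procedures and is restored on return, which is the trivial static scoping described in the text.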

The above definition makes use of a few auxiliary values and functions: 



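$$\begin{array}{lcl}
newlocs & :: & locals\\
setlocs & :: & state \to locals \to state\\
getlocs & :: & state \to locals\\
\_[\_:=\_] & :: & state \to vname \to val \to state
\end{array}$$

shorthand: $\sigma\langle X\rangle = getlocs\ \sigma\ X$
shorthand: $\sigma[v/X] = \sigma[Loc\ X := v]$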



Our meta theory does not need to define them further as it is independent of their
meaning. newlocs is intended to yield the empty set of local variables, setlocs sets
the local variables component of the state to a given set of variables, and getlocs
returns the local variables of the state. The update function $\_[\_:=\_]$ modifies
the state at the given point with a new value, i.e. assigns to a (global or local)
variable if it already exists, or otherwise allocates and initializes one.

Properties of the evalc relation, for instance determinism, are typically proved 
via rule induction, i.e. induction on the depth of derivations. In contrast, struc- 
tural induction (on the syntax of statements) is unsuitable in most cases because 
rules like Call yield structural expansion rather than reduction. 



3 Axiomatic Semantics for IMPp 

Now that we have introduced the language IMPp, we can describe the core of 
our work, which is its axiomatic semantics (“Hoare logic”). 

3.1 Assertions and Hoare Triples 

Central to any axiomatic semantics is the notion of assertions, which describe 
properties of the program state before and after executing commands. Semanti- 
cally speaking, assertions are just predicates on the state. We adopt this abstract 
view (similarly to our semantic view of expressions) and thus avoid talking explic- 
itly on a syntactic level about terms and substitution and their interpretation. 
In other words, we do a “shallow embedding” of assertions in our (meta-)logic 
HOL. Thus, the issue of expressiveness of assertions disappears, and our notion 
of completeness automatically means completeness (basically) in the sense of 
Cook [2], i.e. completeness relative to the assumptions that all desired assertions 
can be expressed syntactically and all valid pure HOL formulas can be proved. 

Following Kleymann[5], we give the role of auxiliary variables the attention 
it deserves. Auxiliary variables, also known as “logical” variables (as opposed to 
program variables), are necessary to relate input and output, in particular to 






express invariance properties. For example, the proposition that a procedure P 
does not change the contents of a program variable X is formulated as the Hoare 
triple {X=Z} Call P {X=Z}, which should mean that whenever X has some 
value Z before calling P, after return it still has the same value. With this 
interpretation, Z serves as an auxiliary variable that is implicitly universally 
quantified. Early works on Hoare logic tended to view Z as a free^ variable, which 
gives the desired interpretation only if the triple occurs positively, and otherwise 
gives incorrect results. Viewing Z as an arbitrary (yet fixed) constant preserves 
correctness, but this approach suffers from incompleteness: having obtained a 
procedure specification like {X=Z} Call Quad {Y=Z*Z}, it is often necessary 
to exploit (i.e., specialize) it for different instantiations of Z, which is impossible 
if Z is essentially a constant. The classical way out is sets of substitution and 
adaptation rules involving intricate side-conditions on variable occurrences. A 
real solution would be explicit quantification like $\forall Z.\ \{P\ Z\}\ c\ \{Q\ Z\}$, but this
changes the structure of Hoare triples and makes them more difficult to handle. 
Instead we prefer implicit quantification at the level of triple validity, given 
below, making assertions explicitly dependent not only on the state, but also on 
auxiliary variables. 

How many auxiliary variables are required, and of which types, of course
depends on the application. So we define the type of assertions with a parameter:

$\alpha\ assn = \alpha \to state \to bool$

where $\alpha$ may be instantiated as required. Thus the (pretty-printed) postcondition
{Y=Z*Z} mentioned above fully formally reads as $\{\lambda Z\,\sigma.\ \sigma\langle Y\rangle = Z*Z\}$ where
$\alpha = int$. In general it is appropriate (and essential) to let $\alpha$ be the whole state,
such that all program variables can be monitored when constructing an arbitrary
relation between initial and final states.

Built on the type $\alpha\ assn$, we model a Hoare triple as the (degenerate) datatype
$\alpha\ triple = \{\alpha\ assn\}\ com\ \{\alpha\ assn\}$. It is valid wrt. partial correctness, written
$\models\{P\}\,c\,\{Q\}$, iff $\forall Z\,\sigma.\ P\ Z\ \sigma \Rightarrow \forall\sigma'.\ \langle c,\sigma\rangle \longrightarrow \sigma' \Rightarrow Q\ Z\ \sigma'$. Note the universal
quantification on the auxiliary variable $Z$ motivated above. This preliminary
definition will be refined and extended to judgments with assumptions in §4.1.
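On a finite state space this notion of validity can be tested by enumeration. The Python sketch below is an illustration under several assumptions: states are dictionaries over two variables X and Y with a small value range, a command is modelled by a function returning the list of its final states (empty if it does not terminate), and the procedure Quad is assumed to perform Y := X * X. The names `VALUES`, `states`, `quad`, `diverge` and `valid` are hypothetical.

```python
from itertools import product

VALUES = range(-3, 4)  # small assumed value range for X, Y and Z

def states():
    for x, y in product(VALUES, repeat=2):
        yield {"X": x, "Y": y}

def quad(s):
    """Assumed behaviour of the procedure Quad: Y := X * X."""
    return [dict(s, Y=s["X"] * s["X"])]

def diverge(s):
    """A command that never terminates has no final state."""
    return []

def valid(pre, c, post):
    """|= {pre} c {post}: for all Z and sigma satisfying pre Z sigma,
    every final state sigma1 of c satisfies post Z sigma1."""
    return all(post(z, s1)
               for z in VALUES
               for s in states() if pre(z, s)
               for s1 in c(s))

x_is_z = lambda z, s: s["X"] == z
y_is_z = lambda z, s: s["Y"] == z
y_is_zz = lambda z, s: s["Y"] == z * z

print(valid(x_is_z, quad, y_is_zz))   # {X=Z} Quad {Y=Z*Z}
print(valid(x_is_z, quad, x_is_z))    # {X=Z} Quad {X=Z}
print(valid(y_is_z, quad, y_is_z))    # {Y=Z} Quad {Y=Z}
```

Because Z is quantified outside the whole implication, the same value of Z links the initial and the final state. The last triple fails, and any triple about `diverge` holds vacuously, as expected for partial correctness.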

3.2 Rules not Dealing with Procedures 

The remainder of the current section is dedicated to the question of which Hoare- 
style rules should be given for the axiomatic semantics of IMPp. For the moment, 
simple derivation judgments with single triples, written $\vdash \{P\}\,c\,\{Q\}$, suffice to
capture everything but recursive procedures. So we take the usual rules, with 
two exceptions. 

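$$\textsc{Local}\quad\frac{\vdash \{P\}\ c\ \{\lambda Z\,\sigma.\ Q\ Z\ (\sigma[\sigma'\langle X\rangle/X])\}}{\vdash \{\lambda Z\,\sigma.\ \sigma'=\sigma \wedge P\ Z\ (\sigma[a\,\sigma/X])\}\ \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c\ \{Q\}}$$

The Local rule adapts the pre- and postconditions reflecting the operational
semantics directly. To facilitate this, it remembers the initial state in $\sigma'$ and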

^ According to standard conventions, such variables are implicitly universally
quantified, i.e. $\Gamma \vdash t\ Z$ is read as $\forall Z.\ \Gamma \vdash t\ Z$. Problems arise if $Z$ occurs also in $\Gamma$.






extracts the value of $X$ with $\sigma'\langle X\rangle$. (The meta variable $\sigma'$ could also be put
as an auxiliary variable, but this would complicate matters unnecessarily.) As
opposed to the rule given in [5] , this yields a straightforward handling of local 
variables. In particular, we do not require explicit mechanisms catering for static 
scoping because local variables are kept separate from global ones and are reset 
completely on procedure call (see §3.3 below). Another option, suggested in [1], 
would be to simply alpha rename X in c, but this would require a syntactic 
side-condition, namely that the new name does not already occur in P, Q and c, 
and an unpleasant modification of the program text. 



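$$\textsc{conseq}\quad\frac{\forall Z\,\sigma.\ P\ Z\ \sigma \Rightarrow \exists P'\,Q'.\ \vdash\{P'\}\ c\ \{Q'\} \wedge (\forall\sigma'.\ (\forall Z'.\ P'\ Z'\ \sigma \Rightarrow Q'\ Z'\ \sigma') \Rightarrow Q\ Z\ \sigma')}{\vdash \{P\}\ c\ \{Q\}}$$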



Our conseq rule is a strengthened version of the generalized rule of consequence 
discovered by Kleymann. As motivated in [10], it allows adapting the values of 
the auxiliary variables as required, due to the universal quantification in their 
interpretation discussed above. Additionally here, the triple in the premise only 
needs to be derivable if the precondition P holds, and both new pre- and postcon- 
ditions may depend on the auxiliary variables and the initial state. This allows 
not only other common structural rules to be derived (rather than asserted), like 

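$$\frac{}{\vdash \{P\}\ c\ \{\lambda Z\,\sigma.\ True\}}\qquad\qquad\frac{\vdash \{P\}\ c\ \{Q\} \qquad \vdash \{P'\}\ c\ \{Q'\}}{\vdash \{\lambda Z\,\sigma.\ P\ Z\ \sigma \vee P'\ Z\ \sigma\}\ c\ \{\lambda Z\,\sigma.\ Q\ Z\ \sigma \vee Q'\ Z\ \sigma\}}$$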



but also new structural rules, e.g. one facilitating the use of the Local and CALL
rules:

$$\textsc{export}\quad\frac{\forall\sigma'.\ \vdash \{\lambda Z\,\sigma.\ \sigma'=\sigma \wedge P\ Z\ \sigma\}\ c\ \{Q\}}{\vdash \{P\}\ c\ \{Q\}}$$



A typical example is the derivation (modulo predicate-logical steps) for the fact 
that a local variable does not affect outer local variables with the same name: 



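$$\dfrac{\dfrac{\dfrac{\forall\sigma'.\ \Gamma \vdash \{\lambda Z\,\sigma.\ True\}\ c\ \{\lambda Z\,\sigma.\ \sigma'\langle X\rangle = (\sigma[\sigma'\langle X\rangle/X])\langle X\rangle\}}{\forall\sigma'.\ \Gamma \vdash \{\lambda Z\,\sigma.\ \sigma'=\sigma \wedge True\}\ \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c\ \{\lambda Z\,\sigma.\ \sigma'\langle X\rangle = \sigma\langle X\rangle\}}\ \textsc{Local}}{\forall\sigma'.\ \Gamma \vdash \{\lambda Z\,\sigma.\ \sigma'=\sigma \wedge Z=\sigma\langle X\rangle\}\ \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c\ \{\lambda Z\,\sigma.\ Z=\sigma\langle X\rangle\}}}{\Gamma \vdash \{\lambda Z\,\sigma.\ Z=\sigma\langle X\rangle\}\ \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c\ \{\lambda Z\,\sigma.\ Z=\sigma\langle X\rangle\}}$$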



In a similar way, using some properties of getlocs and _[_:=_], a version of 
the Local rule corresponding to the classical rule leading to dynamic scope (cf. 
Rule 17 in [1]) can be derived: 



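$$\frac{\forall v.\ \Gamma \vdash \{\lambda Z\,\sigma.\ P\ Z\ (\sigma[v/X]) \wedge \sigma\langle X\rangle = a\ (\sigma[v/X])\}\ c\ \{\lambda Z\,\sigma.\ Q\ Z\ (\sigma[v/X])\}}{\Gamma \vdash \{P\}\ \mathtt{LOCAL}\ X{:=}a\ \mathtt{IN}\ c\ \{Q\}}$$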



3.3 Simple Procedure Rules 

When arriving at procedures, one is faced with the problem that in any practical 
calculus recursion cannot be handled trivially (i.e. by repeated unfolding). As a 
first step, we adopt the standard solution of introducing Hoare triples as assump- 
tions of judgments, which enables one to cope with recursive calls of an already 






unfolded procedure by appealing to a suitable assumption. Revising judgments
(currently $\vdash \_ :: \alpha\ triple \to bool$) to $\_ \vdash \_ :: \alpha\ triple\ set \to \alpha\ triple \to bool$, we
allow putting triples as assumptions into the contexts of the derivation. In order
to reflect this revision, we have to add a context $\Gamma$ to all judgments in the above
rules. Next, we add three rules, the first of them being the well-known CallN
(‘N’ stands for ‘nested’) rule that makes the specification of the currently
unfolded procedure available as an assumption when verifying the procedure body.
The second rule enables exploiting assumptions.



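$$\textsc{CallN}\quad\frac{\{\{P\}\ \mathtt{Call}\ pn\ \{Q\}\} \cup \Gamma \vdash \{P\}\ body\ pn\ \{Q\}}{\Gamma \vdash \{P\}\ \mathtt{Call}\ pn\ \{Q\}}\qquad\qquad\textsc{asm}\quad\frac{t \in \Gamma}{\Gamma \vdash t}$$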



The third rule, CALL, is responsible for adapting the local variables, resembling
the Local rule, though it adapts not only one variable. It resets all local variables
and binds the parameter, and in the postcondition restores them (remembering
the initial state in $\sigma'$) except for the one receiving the result:

$$\textsc{CALL}\quad\frac{\Gamma \vdash \{P\}\ \mathtt{Call}\ pn\ \{\lambda Z\,\sigma.\ Q\ Z\ ((setlocs\ \sigma\ (getlocs\ \sigma'))[X{:=}\sigma\langle Res\rangle])\}}{\Gamma \vdash \{\lambda Z\,\sigma.\ \sigma'=\sigma \wedge P\ Z\ ((setlocs\ \sigma\ newlocs)[a\,\sigma/Arg])\}\ X{:=}\mathtt{CALL}\ pn(a)\ \{Q\}}$$



This rule demonstrates how easy it is to include (call-by-value) procedure 
parameters, which have been left out by [5]. It is inspired by a similar rule 
from [9] , but differs in that it does not have to impose any syntactic restrictions 
on the variables occurring in the pre- and postconditions.



3.4 Extended Procedure Rules 

As we will show in §5.2, the calculus as given up to here is already complete. Yet 
when using it to verify mutually recursive procedures with non-linear invocation 
structure, it becomes tedious: since the assumptions about recursive invocations 
can only be collected stepwise, often large parts of the proof have to be repeated 
for different invocation contexts. Consider the example of three procedures P, Q 
and R, where P calls Q and R, Q calls R, and R calls P and Q. Verifying them 
with the CallN rule yields the following, roughly abstracted, proof tree: 

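$$\begin{array}{c}
\{P,Q,R\} \vdash \mathtt{Call}\ P \quad \{P,Q,R\} \vdash \mathtt{Call}\ Q \qquad\qquad \{P,Q,R\} \vdash \mathtt{Call}\ R\\
\vdots\ \ \{P,Q,R\} \vdash (\text{body of } R)\ \ \vdots \qquad\qquad \vdots\ \ \{P,Q,R\} \vdash (\text{body of } Q)\ \ \vdots\\
\{P,Q\} \vdash \mathtt{Call}\ R \qquad\qquad \{P,R\} \vdash \mathtt{Call}\ P \quad \{P,R\} \vdash \mathtt{Call}\ Q\\
\vdots\ \ \{P,Q\} \vdash (\text{body of } Q)\ \ \vdots \qquad\qquad \vdots\ \ \{P,R\} \vdash (\text{body of } R)\ \ \vdots\\
\{P\} \vdash \mathtt{Call}\ Q \qquad\qquad \{P\} \vdash \mathtt{Call}\ R\\
\vdots\ \ \{P\} \vdash (\text{body of } P)\ \ \vdots\\
\emptyset \vdash \mathtt{Call}\ P
\end{array}$$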

The bodies of Q and R each are verified twice, which may be very redundant. 
This can be avoided by conducting a simultaneous rather than nested verification 
of all procedures involved. Verification condition generators such as [4] take this 






idea to the extreme by verifying all procedures contained in a program simul- 
taneously, forcing the user to identify in advance a single specification for each 
procedure suitable to cover all invocation contexts. Our solution - given next - 
is more flexible because it permits, each time a call to a cluster of mutually re- 
cursive procedures is encountered, to verify simultaneously as many procedures 
as required (but not more) and to identify the necessary specifications locally. 

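We extend the judgments further to $\_ \Vdash \_ :: \alpha\ triple\ set \to \alpha\ triple\ set \to bool$
($\Gamma \vdash t$ now becomes an abbreviation of $\Gamma \Vdash \{t\}$) and replace the CallN rule by

$$\textsc{Call}\quad\frac{\Gamma \cup \{\{P_i\}\ \mathtt{Call}\ i\ \{Q_i\} \mid i \in ps\} \Vdash \{\{P_i\}\ body\ i\ \{Q_i\} \mid i \in ps\} \qquad p \in ps}{\Gamma \vdash \{P_p\}\ \mathtt{Call}\ p\ \{Q_p\}}$$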

When using this rule to verify a call of p, one can decide to verify simultaneously 
an arbitrary family of procedures where ps is the set of their names including p. 
Of course, we now need introduction rules for (finite) conjunctions of triples, 
whereas elimination rules like subset may be derived from the others. 

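$$\frac{}{\Gamma \Vdash \emptyset}\qquad\qquad\frac{\Gamma \vdash t \qquad \Gamma \Vdash ts}{\Gamma \Vdash \{t\} \cup ts}\qquad\qquad\textsc{subset}\quad\frac{\Gamma \Vdash ts' \qquad ts \subseteq ts'}{\Gamma \Vdash ts}$$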



Exploiting the simultaneous Call rule, the proof tree of the above example 
collapses to 

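$$\begin{array}{c}
\{P,Q,R\} \vdash \mathtt{Call}\ P \qquad \{P,Q,R\} \vdash \mathtt{Call}\ Q \qquad \{P,Q,R\} \vdash \mathtt{Call}\ R\\
\vdots\ \ \{P,Q,R\} \Vdash (\text{bodies of } P,\ Q \text{ and } R)\ \ \vdots\\
\emptyset \vdash \mathtt{Call}\ P
\end{array}$$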



where no redundancy concerning procedure bodies remains. 
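The saving can be quantified by counting how often each body is verified. The Python sketch below is an illustration only: it abstracts a proof to the call graph of the example (P calls Q and R, Q calls R, R calls P and Q) and ignores the actual specifications. The names `CALLS`, `nested` and `simultaneous` are hypothetical.

```python
from collections import Counter

# Call graph of the example: P calls Q and R, Q calls R, R calls P and Q.
CALLS = {"P": ["Q", "R"], "Q": ["R"], "R": ["P", "Q"]}

def nested(proc, ctx, count):
    """CallN-style proof: unfold proc unless its specification is assumed."""
    if proc in ctx:
        return                  # closed by the asm rule
    count[proc] += 1            # the body of proc is verified once more
    for callee in CALLS[proc]:
        nested(callee, ctx | {proc}, count)

def simultaneous(proc):
    """Simultaneous Call rule: each body in the reachable cluster once."""
    cluster, todo = set(), [proc]
    while todo:
        p = todo.pop()
        if p not in cluster:
            cluster.add(p)
            todo.extend(CALLS[p])
    return Counter(cluster)

nested_count = Counter()
nested("P", frozenset(), nested_count)
print(dict(nested_count))
print(dict(simultaneous("P")))
```

For this call graph the nested strategy verifies the bodies of Q and R twice each, in line with the proof tree of §3.4, while the simultaneous strategy verifies every body exactly once.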

Though it is - strictly speaking - not necessary, we found the cut rule very 
useful in applications, as it helps to adapt the premises of judgments. A similar 
rule, complementing the subset rule, is the well-known weaken rule. It can be de- 
rived from all others by rule induction, or obtained as an immediate consequence 
of cut and a strengthened version of asm. 



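$$\text{cut}\ \ \frac{\Gamma' \Vdash ts \qquad \Gamma \Vdash \Gamma'}{\Gamma \Vdash ts} \qquad \text{weaken}\ \ \frac{\Gamma' \Vdash ts \qquad \Gamma' \subseteq \Gamma}{\Gamma \Vdash ts} \qquad \text{asm'}\ \ \frac{ts \subseteq \Gamma}{\Gamma \Vdash ts}$$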



4 The Proof of Soundness 

This section motivates our actual definition of validity for Hoare triples, which 
is influenced by the proof of soundness outlined thereafter. 



4.1 Validity 

Validity involving assumptions, $\_ \mathrel{|\!\models} \_ :: \mathit{triple\ set} \Rightarrow \mathit{triple\ set} \Rightarrow \mathit{bool}$, could
be defined as $\Gamma \mathrel{|\!\models} ts = (\forall t \in \Gamma.\ \models t) \Rightarrow (\forall t \in ts.\ \models t)$. This would be
reasonable, but when attempting to prove the Call rule, which adds assumptions about
recursive procedure calls, an inductive argument on the depth of these calls is
needed. This could be achieved by syntactic manipulations that unfold procedure
calls up to a given depth n, as done in [3]. We prefer a semantic approach
instead, which is influenced by [9] and [5]. We define a variant of the operational
semantics that includes a counter for the recursive depth of evaluations,
represented by the judgment $\langle\_,\_\rangle \xrightarrow{\ \_\ } \_ :: \mathit{com} \Rightarrow \mathit{state} \Rightarrow \mathit{nat} \Rightarrow \mathit{state} \Rightarrow \mathit{bool}$.

The inductive rules using this new form are exactly the same as in §2.1, except
for replacing $\longrightarrow$ by $\xrightarrow{n}$ and replacing the Call rule by

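$$\text{Call}\quad \frac{\langle \mathrm{body}\ pn,\ \sigma_0\rangle \xrightarrow{n} \sigma_1}{\langle \mathrm{CALL}\ pn,\ \sigma_0\rangle \xrightarrow{n+1} \sigma_1}$$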

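This refinement does not affect the semantics, i.e. the parameter n is a mere
annotation, stating that evaluation needs to be done only up to recursive depth n.
The equivalence $(\langle c,\sigma\rangle \longrightarrow \sigma') = (\exists n.\ \langle c,\sigma\rangle \xrightarrow{n} \sigma')$ can be shown by rule
induction for each direction, where the '$\Longrightarrow$' direction requires the lemma

$$\langle c_1,\sigma_1\rangle \xrightarrow{n_1} \sigma_1' \wedge \langle c_2,\sigma_2\rangle \xrightarrow{n_2} \sigma_2' \Rightarrow \exists n.\ \langle c_1,\sigma_1\rangle \xrightarrow{n} \sigma_1' \wedge \langle c_2,\sigma_2\rangle \xrightarrow{n} \sigma_2'$$

which in turn requires non-strictness: $\langle c,\sigma\rangle \xrightarrow{n} \sigma' \wedge n \le m \Rightarrow \langle c,\sigma\rangle \xrightarrow{m} \sigma'$.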
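
The role of the depth counter can be illustrated with a toy interpreter. This is a minimal sketch under stated assumptions, not the Isabelle formalization: commands are encoded as Python tuples, the WHILE loop is omitted for brevity, the procedure `down` is a hypothetical example, and returning `None` stands for "no evaluation exists within depth n" (which is adequate only because the toy language is deterministic).

```python
def exec_n(cmd, state, n, bodies):
    """Return the final state of `cmd` if it can be evaluated within recursive
    depth `n`, and None if some call would exceed that depth."""
    kind = cmd[0]
    if kind == "skip":
        return state
    if kind == "assign":              # ("assign", var, function of the state)
        _, var, expr = cmd
        return {**state, var: expr(state)}
    if kind == "seq":                 # ("seq", c1, c2)
        mid = exec_n(cmd[1], state, n, bodies)
        return None if mid is None else exec_n(cmd[2], mid, n, bodies)
    if kind == "if":                  # ("if", test, c_then, c_else)
        branch = cmd[2] if cmd[1](state) else cmd[3]
        return exec_n(branch, state, n, bodies)
    if kind == "call":                # ("call", procedure name)
        if n == 0:
            return None               # no rule derives a call at depth 0
        return exec_n(bodies[cmd[1]], state, n - 1, bodies)
    raise ValueError(f"unknown command {kind!r}")


# Hypothetical procedure: count x down to 0 by recursion.
BODIES = {
    "down": ("if", lambda s: s["x"] > 0,
             ("seq", ("assign", "x", lambda s: s["x"] - 1), ("call", "down")),
             ("skip",)),
}

start = {"x": 3}
results = [exec_n(("call", "down"), start, n, BODIES) for n in range(7)]
print(results)
```

Starting from x = 3, the call needs depth 4; every smaller depth yields no result and every larger depth yields the same final state, which is the non-strictness property used above.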

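According to the refined notion of statement execution, the notion of validity
for single Hoare triples receives the recursive depth as an extra parameter:

$$\models n{:}\{P\}\,c\,\{Q\} = \forall Z\ \sigma.\ P\ Z\ \sigma \Rightarrow \forall \sigma'.\ \langle c,\sigma\rangle \xrightarrow{n} \sigma' \Rightarrow Q\ Z\ \sigma'$$

This definition carries over to sets of triples by $\mathrel{|\!\models} n{:}ts = \forall t \in ts.\ \models n{:}t$.

Now we can define the final notion of validity including assumptions as

$$\Gamma \mathrel{|\!\models} ts = \forall n.\ \mathrel{|\!\models} n{:}\Gamma \Rightarrow\ \mathrel{|\!\models} n{:}ts$$

This version is strong and detailed enough to perform induction on the recursive
depth. On the other hand, when the set of assumptions is empty, it is equivalent
to the version given above because the chain

$$(\emptyset \mathrel{|\!\models} ts) = (\forall n.\ \mathrel{|\!\models} n{:}\emptyset \Rightarrow\ \mathrel{|\!\models} n{:}ts) = (\forall n.\ \mathrel{|\!\models} n{:}ts) = (\forall t \in ts.\ \models t) = ((\forall t \in \emptyset.\ \models t) \Rightarrow (\forall t \in ts.\ \models t))$$

holds.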

4.2 Actual Soundness Proof 

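With our new definition of validity we can express soundness as $\emptyset \vdash t \Longrightarrow\ \models t$.
This is a direct instance of $\Gamma \Vdash ts \Longrightarrow \Gamma \mathrel{|\!\models} ts$, which can be shown by rule
induction on the derivation of the Hoare judgments and an auxiliary rule induction
for the Loop rule. The Call rule is the only difficult case, where we benefit from
the proof given in [3] suggesting a lemma that in our case reads as

$$\Gamma \cup \{\{P_i\}\ \mathrm{Call}\ i\ \{Q_i\} \mid i \in ps\} \mathrel{|\!\models} \{\{P_i\}\ \mathrm{body}\ i\ \{Q_i\} \mid i \in ps\} \Longrightarrow\ \mathrel{|\!\models} n{:}\Gamma \Rightarrow\ \mathrel{|\!\models} n{:}\{\{P_i\}\ \mathrm{Call}\ i\ \{Q_i\} \mid i \in ps\}$$

Here is the point where the bounded recursive depth comes in, as we conduct the
proof by induction on n. Doing this, we exploit the simple facts $\models n{+}1{:}t \Longrightarrow\ \models n{:}t$,
$\models 0{:}\{P\}\ \mathrm{Call}\ i\ \{Q\}$, and $(\models n{+}1{:}\{P\}\ \mathrm{Call}\ i\ \{Q\}) = (\models n{:}\{P\}\ \mathrm{body}\ i\ \{Q\})$. The
CallN rule can of course be derived directly from the Call rule.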
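
The induction on n can be spelled out as follows. This is a sketch reconstructed from the three facts just listed, not a transcript of the mechanized proof; the abbreviations $C$ and $B$ are introduced here only for brevity and stand for the sets of call triples and body triples indexed by $ps$.

```latex
\begin{align*}
\text{Premise:}\quad & \forall m.\ \mathrel{|\!\models} m{:}(\Gamma \cup C) \Rightarrow\ \mathrel{|\!\models} m{:}B
  && \text{(unfolding } \Gamma \cup C \mathrel{|\!\models} B\text{)} \\
\text{Base } (n = 0):\quad & \mathrel{|\!\models} 0{:}C
  && \text{(since } \models 0{:}\{P\}\ \mathrm{Call}\ i\ \{Q\}\text{)} \\
\text{Step } (n \to n{+}1):\quad & \text{assume } \mathrel{|\!\models} n{+}1{:}\Gamma \\
 & \mathrel{|\!\models} n{:}\Gamma
  && \text{(monotonicity, } \models n{+}1{:}t \Longrightarrow\ \models n{:}t\text{)} \\
 & \mathrel{|\!\models} n{:}C
  && \text{(induction hypothesis)} \\
 & \mathrel{|\!\models} n{:}(\Gamma \cup C)
  && \text{(union of the two previous lines)} \\
 & \mathrel{|\!\models} n{:}B
  && \text{(premise with } m = n\text{)} \\
 & \mathrel{|\!\models} n{+}1{:}C
  && \text{(call at depth } n{+}1 \text{ equals body at depth } n\text{)}
\end{align*}
```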

As we can conclude from this section, the only interesting aspect of the proof 
of soundness is to find a suitable notion of validity capable of capturing an 
inductive argument on the recursive depth of procedure calls. Of course, due to 
the number of rules in the operational and axiomatic semantics, in the inductive 
proofs there are a lot of cases involving some amount of detail to be considered, 
for which the mechanical theorem prover is of great help. 






5 Three Proofs of Completeness 

Much more challenging than the proof of soundness is the proof of completeness. 
Here we benefit heavily from the MGF approach promoted by [5] and others. 
We extend this approach, which was given for only a single recursive procedure, 
to several mutually recursive procedures. When dealing with mutual recursion 
some complications arise, which we overcome in three different ways, each with 
specific advantages and drawbacks. For lack of space we can describe only proof 
outlines and mention crucial lemmas. 

5.1 The MGF Approach 

For proving completeness of Hoare logics involving procedures, typically some
variant of the Most General Formula, MGF for short, is used. An MGF is a judgment
$\Gamma \vdash MGT\ c$ where MGT takes a command c and returns a Most General Triple which
describes the most general property of c, namely its operational semantics. The
basic variant of an MGT for partial correctness is

$$MGT\ c = \{\lambda Z\ \sigma_0.\ Z = \sigma_0\}\ c\ \{\lambda Z\ \sigma_1.\ \langle c, Z\rangle \longrightarrow \sigma_1\}$$

Its precondition stores the initial state $\sigma_0$ in the auxiliary variable Z, which is
consequently of type state here. Its postcondition claims that if the execution of
command c terminates in some state $\sigma_1$, this is the same as the outcome of the
operational semantics of c, starting also from $\sigma_0$.

Common to all variants of MGTs is that once the corresponding MGF has
been proved, completeness almost immediately emerges by virtue of the rule of
consequence. For instance, $\emptyset \vdash MGT\ c \Longrightarrow \emptyset \models \{P\}\,c\,\{Q\} \Longrightarrow \emptyset \vdash \{P\}\,c\,\{Q\}$ can be
proved in a two-line Isabelle script applying the definition of validity.

5.2 Version 1: Nested Structural Induction 

The outline, proposed by Martin Hofmann [3], of our first completeness proof
employs two inductions (in very similar situations) on the structure of commands
and a variant of MGT that is a bit more involved, namely

$$MGT'\ c = \{\lambda Z\ \sigma_0.\ \forall \sigma_1.\ \langle c,\sigma_0\rangle \longrightarrow \sigma_1 \Rightarrow Z = \sigma_1\}\ c\ \{\lambda Z\ \sigma_1.\ Z = \sigma_1\}$$

We refine the outline a little, first by factoring out structural induction into
the MGT-lemma $(\forall p.\ \Gamma \vdash MGT\ (\mathrm{Call}\ p)) \Longrightarrow \Gamma \vdash MGT\ c$ such that it is performed
only once, and second by replacing MGT' by the simpler MGT.¹

The proof of $\emptyset \vdash MGT\ c$ reveals the crux of structural induction: when
arriving at unfolding procedure calls, the new subgoal gets structurally larger, such
that we cannot appeal directly to any induction hypothesis. Assumptions in the
judgments come to the rescue. Still, there remains a challenge: when using them
naively, one is faced with the need to use structural induction nested as deep as
the number of procedures in the program. This problem is overcome by resorting
to an auxiliary induction on the number of procedures not yet considered, such
that we strengthen our proof goal to

$$\Gamma' = \{MGT\ (\mathrm{Call}\ p) \mid \mathit{True}\} \Longrightarrow \forall \Gamma.\ \Gamma \subseteq \Gamma' \Rightarrow n \le |\Gamma'| \Rightarrow |\Gamma| = |\Gamma'| - n \Rightarrow \forall c.\ \Gamma \vdash MGT\ c$$

where $\Gamma'$ equals the set of all possible procedure calls. Its proof is by induction
on n, exploiting the MGT-lemma twice. It heavily depends on $\Gamma'$ being finite, as
otherwise calculations on cardinality like $|\Gamma| = |\Gamma'| - n$ would be meaningless.
Now, $\emptyset \vdash MGT\ c$ is an immediate consequence (just specialize $\Gamma$ to $\emptyset$ and n to $|\Gamma'|$).

¹ For the case of the WHILE loop, we return to MGT' because there the auxiliary variable
$\sigma_0$ has to serve as the (invariant) final state of the iteration. Both variants are
equivalent, where MGT' entails MGT only if the language is deterministic (which is
true for IMPp) and there are at least two different program states, which we simply
assume since empty or singleton state spaces are of no interest anyway.

Note that this version of completeness proof gets by with the CallN version 
of the Call rule (thus not requiring the rules empty and insert), but on the 
other hand needs to apply it in a nested way. 



5.3 Version 2: Simultaneous Structural Induction 

Our desire to circumvent the nesting problem of Version 1 has been the motivation
for inventing the Call rule as an extension of CallN, which allows handling
procedures simultaneously. Version 2 is also by structural induction and makes
use of the MGT-lemma, but by exploiting the power of the Call rule, it takes
only a much simpler lemma, namely

$$F \subseteq \{MGT\ (\mathrm{body}\ p) \mid \mathit{True}\} \Longrightarrow \{MGT\ (\mathrm{Call}\ p) \mid \mathit{True}\} \Vdash F$$

The latter is proved by induction on the size of F, so
finiteness is vital also here. Compared with Version 1, Version 2 requires a
more advanced Call rule (and the two simple structural rules empty and insert),
but handles mutual recursion more directly and thus more clearly.



5.4 Version 3: Rule Induction

Our third version of completeness proof takes the MGF approach to the extreme. 
It gave us surprising insights into the nature of Hoare logic, yet is probably of 
mainly theoretic interest because we could not avoid supporting it with two 
additional rules. Our intuition when discovering this approach has been that 
structural induction is not too nice, in particular when handling recursion, as 
the other versions show. Let us employ a more direct and powerful induction 
scheme: rule induction on the operational semantics. 

The pattern of rule induction requires that the inductively defined relation,
evalc here, occurs negatively in the formula to be proved. Unfortunately, neither
$\emptyset \models \{P\}\,c\,\{Q\} \Longrightarrow \emptyset \vdash \{P\}\,c\,\{Q\}$ itself nor $\emptyset \vdash MGT\ c$ are of this pattern. Let us
resort to

$$\forall \sigma_0\ \sigma_1.\ \langle c,\sigma_0\rangle \longrightarrow \sigma_1 \Longrightarrow \emptyset \vdash \{\lambda Z\ \sigma.\ \sigma = \sigma_0\}\ c\ \{\lambda Z\ \sigma.\ \sigma = \sigma_1\}$$

which is a kind of MGF property where the evalc relation has been pulled out of the
assertion into the meta logic. From this formula we can easily show completeness
applying our strong rule of consequence, but we have to require the (clearly
admissible, yet non-derivable) extra rule



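$$\text{diverg}\quad \frac{}{\Gamma \vdash \{\lambda Z\ \sigma.\ \neg\exists \sigma'.\ \langle c,\sigma\rangle \longrightarrow \sigma'\}\ c\ \{Q\}}$$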







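The above MGF property itself is directly amenable to the desired rule
induction, which yields a surprisingly short proof. Unfortunately, it requires an
unfolding variant of the Loop rule reflecting the operational semantics:

$$\frac{\Gamma \vdash \{\lambda Z\ \sigma.\ P\ Z\ \sigma \wedge b\ \sigma\}\ c\ \{Q\} \qquad \Gamma \vdash \{Q\}\ \mathrm{WHILE}\ b\ \mathrm{DO}\ c\ \{R\}}{\Gamma \vdash \{\lambda Z\ \sigma.\ P\ Z\ \sigma \wedge b\ \sigma\}\ \mathrm{WHILE}\ b\ \mathrm{DO}\ c\ \{R\}}$$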

On the other hand, only a trivial variant of the Call rules (namely one without 
assumptions) and no auxiliary variables are needed here. 

Thus we can conclude that, in principle, the issues of assumptions and auxil- 
iary variables can be circumvented! Of course, this is only a theoretical point as 
in actual program verification one does not want to be faced with the operational 
semantics again, which was suitable for the meta-level completeness proof only. 

6 Conclusion 

In this paper we have described new approaches for dealing with mutual
recursion, procedure parameters and local variables in a Hoare-style calculus. The
calculus is powerful - and also simple and convenient - enough to be used in
actual program verification efforts. In particular, we have introduced a relatively
simple handling of local variables, a convenient and flexible rule for
simultaneously verifying mutually recursive procedures, and a strong rule of consequence.

All results have been achieved using the theorem prover Isabelle/HOL, which 
not only gives full confidence in their correctness, but also was a great aid in 
cleanly formalizing the theory and conveniently conducting the proofs. 

We have combined several existing techniques with new ideas, resulting in a 
lucid soundness proof and three variants of completeness proofs. Once discov- 
ered, they should be transferable to other logical systems and programming lan- 
guages with relative ease. The major current application is to an object-oriented 
language, namely the investigation of Java within Project Bali. 



Acknowledgments 

I thank Tobias Nipkow and Martin Hofmann for fruitful discussions on handling 
mutual recursion. The idea how to perform nested structural induction is due 
to Martin Hofmann. I also thank Manfred Broy, Tobias Nipkow, Leonor Prensa 
Nieto, Bernhard Reus, Francis Tang, Markus Wenzel and several anonymous 
referees for their comments on draft versions of this paper. 



References 

1. K. R. Apt. Ten years of Hoare logic: A survey - part I. ACM Trans. on Prog.
Languages and Systems, 3:431-483, 1981.

2. Stephen A. Cook. Soundness and completeness of an axiom system for program
verification. SIAM Journal on Computing, 7(1):70-90, 1978.

3. Martin Hofmann. Semantik und Verifikation. Lecture notes, in German.
http://www.dcs.ed.ac.uk/home/mxh/teaching/marburg.ps.gz, 1997.

4. Peter V. Homeier and David F. Martin. Mechanical verification of mutually
recursive procedures. In M. A. McRobbie and J. K. Slaney, editors, Proceedings of
CADE-13, volume 1104 of LNAI, pages 201-215. Springer-Verlag, 1996.

5. Thomas Kleymann. Hoare logic and VDM: Machine-checked soundness and
completeness proofs. PhD thesis, ECS-LFCS-98-392, LFCS, 1998.

6. Tobias Nipkow. Winskel is (almost) right: Towards a mechanized semantics
textbook. In V. Chandru and V. Vinay, editors, FST&TCS, volume 1180 of LNCS,
pages 180-192. Springer-Verlag, 1996.

7. David von Oheimb and Tobias Nipkow. Machine-checking the Java specification:
Proving type-safety. In Jim Alves-Foss, editor, Formal Syntax and Semantics of
Java, volume 1523 of LNCS. Springer-Verlag, 1999.

8. Lawrence C. Paulson. Isabelle: A Generic Theorem Prover, volume 828 of LNCS.
Springer-Verlag, 1994. Up-to-date description: http://isabelle.in.tum.de/.

9. A. Poetzsch-Heffter and P. Müller. A programming logic for sequential Java. In
S. D. Swierstra, editor, Programming Languages and Systems (ESOP '99), volume
1576 of LNCS, pages 162-176. Springer-Verlag, 1999.

10. Thomas Schreiber. Auxiliary variables and recursive procedures. In TAPSOFT'97,
volume 1214 of LNCS, pages 697-711. Springer-Verlag, 1997.

11. Glynn Winskel. Formal Semantics of Programming Languages. MIT Press, 1993.



Explicit Substitutions and Programming 
Languages 



Jean-Jacques Levy and Luc Maranget 



INRIA - Rocquencourt,
Jean-Jacques.Levy@inria.fr
Luc.Maranget@inria.fr,
http://para.inria.fr/~{levy,maranget}



Abstract. The λ-calculus has been much used to study the theory of
substitution in logical systems and programming languages. However,
with explicit substitutions, it is possible to get finer properties with
respect to gradual implementations of substitutions as effectively done in
runtimes of programming languages. But the theory of explicit
substitutions has some defects such as non-confluence or the non-termination
of the typed case. In this paper, we stress the sub-theory of weak
substitutions, which is sufficient to analyze most of the properties of
programming languages, and which preserves many of the nice theorems
of the λ-calculus.



1 Introduction 

In the past ten years, several calculi of explicit substitutions have been proposed
and studied with various motivations. In their original work, Curien [10] and
Hardin [17] considered Categorical Combinators as an algebraic definition of
the syntax of the λ-calculus. In [1,15], their calculus is simplified by using a
two-sorted language, with terms and substitutions. The goal there was to study
fundamental syntactic properties and applications to the design of runtime
interpreters or fancy type-checkers. Unfortunately, the calculus in [1] is neither
confluent (Church-Rosser property), nor strongly normalizable on the
elementary first-order typed subset [28]. But it is confluent on closed terms (of explicit
substitutions), which are sufficient to represent all λ-terms. Later several calculi
were designed with full confluence [11], or with both properties by
suppressing some of the operations of explicit substitutions such as the composition
of substitutions [4,23,7]. Until very recently no fully expressive calculus existed
with both properties of confluence and strong normalization. The termination
problem, which is connected to cut elimination in linear logic [13], seemed more
difficult; according to Martin-Löf or Melliès, it is due to an unlimited use of
the η-expansion rule in the λ-calculus, which is known to be non-terminating when
coupled with β-conversion. Recently, there has been a proposal for a new
calculus of explicit substitutions with both the confluence and strong normalization
properties [12], but this calculus seems rather complex. Therefore, one may be
skeptical about the usage of these theories.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 181-200, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999






Explicit substitutions may be used for a refined study of logical systems 
with bound variables, for instance, when one wants to axiomatize α-conversion,
in higher-order theorem provers, or in recent process algebras such as Action 
Calculi [30]. In some of these systems, renaming of bound variables has to be 
defined very carefully. This was the case with the axiomatization of the type- 
checker for Cardelli’s Quest language, or with strategies for higher-order term 
matching [14]. 

But explicit substitutions were also introduced to have a formal theory of
runtimes in programming languages, with implications for the CAM-machine
(kernel of the Caml runtime) [9,22] or Krivine's call-by-name machine. Usually,
one then restricts attention to the theory of weak explicit substitutions. There is
nothing really new in this remark, and much of the work at the end of the last decade
was related to the correspondence between the weak λ-calculus and runtimes of
functional languages. But to our knowledge, the theory of weak explicit substitutions
has not been much studied, mainly because its properties look easy, but this is
quite often a "folk" statement.

In this paper, we present several weak theories, which may be considered as
various exercises on the weak λ-calculus, and we try to look carefully at the
fundamental properties of their syntax. The claim is not that a useful theory has
to be confluent or to preserve strong normalization in the typed case, but that
keeping these two properties in mind could help in studying extra properties
such as dependency analysis, shared evaluation, or stack allocation.

In section 2, following Çağman and Hindley in [8] for Combinatory Logic,
we define a confluent version of the weak λ-calculus, which is not a priori obvious
since confluence of this weak theory often fails. This is achieved by allowing
redexes under λ-abstractions to be contracted if they do not contain occurrences
of bound variables. In section 3, we consider a confluent calculus of weak explicit
substitutions exactly corresponding to the calculus of the previous section. In
section 4, we study the corresponding reduction strategies. In section 5, we map
these strategies to runtime interpreters, and show how to state properties such
as stack-allocation or graph-based sharing. In section 6, we consider the ministep
semantics of the weak calculus of explicit substitutions, and show its connection to
more traditional presentations of explicit substitutions with de Bruijn's indices.
We conclude in section 7.



2 Confluence of the Weak λ-Calculus

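As usual, the set of λ-terms is the minimum set of terms M, N defined by

$$M, N ::= x \mid MN \mid \lambda x.M$$

and the β-reduction rule is

$$(\beta)\quad (\lambda x.M)N \rightarrow M[\![x\backslash N]\!]$$

where the substitution $M[\![x\backslash P]\!]$ is recursively defined by

$$\begin{aligned}
x[\![x\backslash P]\!] &= P \\
y[\![x\backslash P]\!] &= y \qquad (y \neq x) \\
(MN)[\![x\backslash P]\!] &= M[\![x\backslash P]\!]\ N[\![x\backslash P]\!] \\
(\lambda y.M)[\![x\backslash P]\!] &= \lambda y.M[\![x\backslash P]\!]
\end{aligned}$$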

In the last case, the substitution must not bind free variables in P, namely y
must not be free in P. Usually in the λ-calculus (as in traditional mathematics),
equality is defined up to the renaming of bound variables (α-conversion). Hence
it is always possible to find y in (λy.M) fulfilling the previous condition. Sometimes
the substitution inside λ-abstractions is analogously defined as

$$(\lambda y.M)[\![x\backslash P]\!] = \lambda y'.M[\![y\backslash y']\!][\![x\backslash P]\!]$$

where y' is a variable not free in M and P.
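
The renaming variant of substitution can be sketched in a few lines. The encoding below is an assumption made for illustration: terms are Python tuples `("var", x)`, `("app", M, N)` and `("lam", x, M)`, and the helper `fresh` with its `v0, v1, ...` naming scheme is hypothetical. The case where the binder equals the substituted variable is handled explicitly, since no α-equivalence is built into the encoding.

```python
import itertools


def free_vars(t):
    """Free variables of a term built from ("var", x), ("app", M, N), ("lam", x, M)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def fresh(avoid):
    """Pick a variable name v0, v1, ... that does not occur in `avoid`."""
    return next(v for v in (f"v{i}" for i in itertools.count()) if v not in avoid)


def subst(m, x, p):
    """Substitute p for the free occurrences of x in m, renaming a bound
    variable whenever it would capture a free variable of p."""
    if m[0] == "var":
        return p if m[1] == x else m
    if m[0] == "app":
        return ("app", subst(m[1], x, p), subst(m[2], x, p))
    y, body = m[1], m[2]
    if y == x:
        return m  # x is not free below this binder
    if y in free_vars(p):
        y2 = fresh(free_vars(body) | free_vars(p) | {x})
        body = subst(body, y, ("var", y2))  # rename y to the fresh y'
        y = y2
    return ("lam", y, subst(body, x, p))


# (λy. x y)[[x\y]]: the bound y is renamed so that the substituted y stays free.
term = ("lam", "y", ("app", ("var", "x"), ("var", "y")))
result = subst(term, "x", ("var", "y"))
print(result)
```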

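In the (strong) λ-calculus, every context is active, since any sub-term may be
reduced at any time. Formally, reduction is defined inductively by the following
set of inference rules

$$(\nu)\ \frac{M \rightarrow M'}{MN \rightarrow M'N} \qquad (\mu)\ \frac{N \rightarrow N'}{MN \rightarrow MN'} \qquad (\xi)\ \frac{M \rightarrow M'}{\lambda x.M \rightarrow \lambda x.M'}$$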



In the weak λ-calculus, the ξ-rule is forbidden, and one cannot reduce inside
λ-abstractions. This corresponds to the natural behavior in programming
languages, since function bodies cannot be evaluated without the actual values of
their arguments. The ξ-rule corresponds more to partial evaluation of a function,
and can be considered only as a compiling transformation.

But such a weak λ-calculus trivially loses the Church-Rosser property
(confluence), since when N → N' we have

$$\begin{array}{ccc}
(\lambda x.\lambda y.M)N & \rightarrow & (\lambda x.\lambda y.M)N' \\
\downarrow & & \downarrow \\
\lambda y.M[\![x\backslash N]\!] & & \lambda y.M[\![x\backslash N']\!]
\end{array}$$

The term $\lambda y.M[\![x\backslash N]\!]$ is in normal form and cannot be reduced, and the previous
diagram does not commute. The problem has been known for a long time in
combinatory logic [19], although often kept as a "folk theorem". In [8], it is
specifically stated, and shown as being relevant when translating the λ-calculus
into combinatory logic. One recovers confluence by adding the new inference rule



$$(\sigma)\quad \frac{N \rightarrow N'}{M[\![x\backslash N]\!] \rightarrow M[\![x\backslash N']\!]}$$

However, this rule is not pleasant, since it axiomatizes parallel reduction
rather than single reduction steps. Let us first say that a variable x is linear in term M
iff there is a unique occurrence of the free variable x in M. Then we prefer the
following variant to the σ-rule

$$(\sigma')\quad \frac{N \rightarrow N' \qquad (x \text{ linear in } M)}{M[\![x\backslash N]\!] \rightarrow M[\![x\backslash N']\!]}$$
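
Linearity is a simple counting condition, as the following sketch shows. The tuple encoding of terms (`("var", x)`, `("app", M, N)`, `("lam", x, M)`) and the function names are assumptions introduced for illustration only.

```python
def free_occurrences(t, x):
    """Number of free occurrences of variable x in term t."""
    if t[0] == "var":
        return 1 if t[1] == x else 0
    if t[0] == "app":
        return free_occurrences(t[1], x) + free_occurrences(t[2], x)
    # ("lam", y, body): occurrences of x below a binder for x are not free
    return 0 if t[1] == x else free_occurrences(t[2], x)


def is_linear(x, m):
    """x is linear in m iff it has exactly one free occurrence in m."""
    return free_occurrences(m, x) == 1


x, y = ("var", "x"), ("var", "y")
print(is_linear("x", ("app", x, y)))      # one free occurrence
print(is_linear("x", ("app", x, x)))      # two free occurrences
print(is_linear("x", ("lam", "x", x)))    # the only occurrence is bound
```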

An alternative statement of (σ') is possible with the context notation. Let a
context C[ ] be a λ-term with a missing sub-term, and let C[M] be the
corresponding term when M is placed into the hole. Say that C[ ] binds M when a
free variable of M is bound in C[M]. Clearly when C[ ] does not bind M, one has
$C[M] = C[x][\![x\backslash M]\!]$ for any fresh variable x not in C[M]. Hence, the previous
rule is equivalent to



$$(\sigma'')\quad \frac{M \rightarrow M' \qquad (C[\ ] \text{ does not bind } M)}{C[M] \rightarrow C[M']}$$

So, inside λ-abstractions, a redex (reducible expression) does not contain free
variables bound in the outside context. Namely, a redex R in M is any sub-term
$R = (\lambda y.A)B$ such that $M = M_1[\![x\backslash R]\!]$ for some free x linear in $M_1$.

Now, by noticing that the σ'-rule encompasses the μ- and ν-rules, it is possible
to define the theory of the weak λ-calculus as the set of λ-terms with the β and σ'
rules. As in [3], the reflexive and transitive closure of → is written ↠. So M ↠ N
iff M can reduce in several steps (maybe none) to N.
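
The characterization of weak redexes lends itself to a direct check: a sub-term (λy.A)B counts as a redex only if none of its free variables is bound by an enclosing abstraction. The sketch below assumes the tuple encoding `("var", x)`, `("app", M, N)`, `("lam", x, M)`; the function names and the example terms are hypothetical.

```python
def free_vars(t):
    """Free variables of a term built from ("var", x), ("app", M, N), ("lam", x, M)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}


def weak_redexes(t, bound=frozenset()):
    """Yield every sub-term (λy.A)B of t whose free variables are not bound
    by an abstraction enclosing it in t."""
    if t[0] == "var":
        return
    if t[0] == "lam":
        yield from weak_redexes(t[2], bound | {t[1]})
        return
    if t[1][0] == "lam" and not (free_vars(t) & bound):
        yield t
    yield from weak_redexes(t[1], bound)
    yield from weak_redexes(t[2], bound)


I = ("lam", "x", ("var", "x"))
under_lambda_closed = ("lam", "z", ("app", I, ("var", "y")))   # λz. I y
under_lambda_open = ("lam", "z", ("app", I, ("var", "z")))     # λz. I z
print(list(weak_redexes(under_lambda_closed)))
print(list(weak_redexes(under_lambda_open)))
```

The first term contains the weak redex I y even though it sits under an abstraction, whereas I z in the second term mentions the bound variable z and is therefore not a redex of the weak calculus.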

Theorem 1. The weak λ-calculus is confluent.



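Proof: one can follow Tait-Martin-Löf's axiomatic method used to prove the
Church-Rosser property. The main remark is that the problematic previous
diagram

$$\begin{array}{ccc}
(\lambda x.\lambda y.M)N & \rightarrow & (\lambda x.\lambda y.M)N' \\
\downarrow & & \downarrow \\
\lambda y.M[\![x\backslash N]\!] & \twoheadrightarrow & \lambda y.M[\![x\backslash N']\!]
\end{array}$$

now commutes since y is not free in N.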



□ 



In fact, the weak λ-calculus enjoys simple syntactic properties. For instance,
in the standard λ-calculus, a rather complex theorem is the so-called finite
development theorem, stating that the order in which a given set of redexes
is contracted is not relevant and thus that there is a consistent definition for
parallel reductions. This property is not easy to prove, since residuals of disjoint
redexes may be nested. Take for instance the term $(\lambda x.Ix)(Iy)$ with $I = \lambda x.x$. Then

$$(\lambda x.Ix)(Iy) \rightarrow I(Iy)$$

where the residuals of the disjoint redexes Ix and Iy are now nested.
This is not the case in the weak λ-calculus, since Ix in (λx.Ix) contains the bound
variable x and is not a redex in the weak λ-calculus. In order to formally define
residuals of redexes, we will use two ways. The first one uses named redexes by
extending the set of λ-terms as follows

$$M, N ::= x \mid MN \mid \lambda x.M \mid (\lambda x.M)^a N$$

where a is taken in a given alphabet of names. The calculus is defined by the
same β and σ' rules with the addition of a new β'-rule for contraction of named
redexes

$$(\beta')\quad (\lambda x.M)^a N \rightarrow M[\![x\backslash N]\!]$$

and substitution is extended by the following equation

$$((\lambda y.M)^a N)[\![x\backslash P]\!] = (\lambda y.M[\![x\backslash P]\!])^a\ N[\![x\backslash P]\!]$$

In order to track redexes along reductions, redexes may be named and the
residuals are redexes with the same names in the term after reduction. Notice that
(named) redexes must not contain external bound variables as implied by (σ')
and it has to be shown that residuals of redexes are still redexes.

The second way is based on the substitution notation. Let R be a redex in M
and let M → M' by contraction of R. Then $M = M_1[\![x\backslash R]\!]$ and $M' = M_1[\![x\backslash R']\!]$
with x linear in $M_1$ and R → R'. Let S be another redex in M. Then
$M = N_1[\![y\backslash S]\!]$ with y linear in $N_1$. Then the residuals of S in M' are defined by case
analysis:

- If S contains R, then $M_1 = N_1[\![y\backslash S_1]\!]$ and $S = S_1[\![x\backslash R]\!]$ where $S_1$ is a
redex and x is linear in $S_1$. (Clearly, $S_1$ is a redex in $M_1$ since x, not bound
in $M_1$, cannot be bound in $S_1$.) Then $M = N_1[\![y\backslash S_1]\!][\![x\backslash R]\!]$ and
$M' = N_1[\![y\backslash S_1]\!][\![x\backslash R']\!]$. So $M' = N_1[\![y\backslash S_1[\![x\backslash R']\!]]\!]$. The residual of S is this unique
redex $S' = S_1[\![x\backslash R']\!]$. It is indeed a redex since $M' = N_1[\![y\backslash S']\!]$. Notice too
that S → S'.

- If S does not contain R, then $N_1 = N_2[\![x\backslash R_1]\!]$ with x linear in $N_2$, and
$R = R_1[\![y\backslash S]\!]$ where y may not appear in $R_1$. Then $N_1 = N_2[\![x\backslash R_1]\!] \rightarrow N_2[\![x\backslash R_1']\!] = N_1'$
with $R_1 \rightarrow R_1'$. And $M = N_1[\![y\backslash S]\!] \rightarrow N_1'[\![y\backslash S]\!] = M'$. The
residuals of S are all the S-redexes appearing at each occurrence of y in $N_1'$.
Notice then that all residuals are equal to S. Again no free variable in S may
be bound in M' since $M' = N_1'[\![y\backslash S]\!]$.

- If $M_1 = N_1$ and x = y, then S and R coincide and S has no residual in M'.

This second definition shows that residuals of redexes are still redexes in the
weak λ-calculus. We also remark that a residual R' of any redex R by a given
many-step reduction is always such that R ↠ R', which is not true in the ordinary
λ-calculus, where one may substitute the free variables of a redex and give much
more reduction power to residuals. Finally, it remains to show that the two ways
of defining residuals of redexes give identical definitions. We leave this as an
exercise to the reader.



Proposition 1. Residuals of disjoint redexes are disjoint redexes. 







Proof: Let R and S be two disjoint redexes in M. The only way for a
residual S' of S to end up inside a residual R' of R by the contraction of a redex
(λx.A)B in M is that A = C[R] and x is a free variable of R substituted by
B = C'[S]. But then R is no longer a redex of the weak λ-calculus, since it
contains the free variable x bound in the external context. □

Let $\mathcal{F}$ be a set of redexes in M. A development of $\mathcal{F}$ is any maximal reduction
only contracting residuals of redexes of $\mathcal{F}$.

Proposition 2. Developments of sets of redexes are always finite. 

Proof: To each R in $\mathcal{F}$, we associate its maximal nesting level n(R), where
$n(R) = \max\{1 + n(S) \mid R \text{ directly contains } S \in \mathcal{F}\}$. So n(R) = 0 if R contains
no redex in $\mathcal{F}$. Consider the multiset $\sigma(\mathcal{F}) = \{n(R) \mid R \in \mathcal{F}\}$ with the natural
multiset ordering. Then each step of the development decreases this multiset,
since by the previous proposition no new nesting appears in the residuals $\mathcal{F}'$ of $\mathcal{F}$.
A redex contained in the contracted redex may have several copies as
residuals, but then its nesting level is less than that of the contracted redex,
which disappears. As the multiset ordering is well-founded, every development
is finite. □
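
The decrease in the multiset ordering can be checked mechanically. The sketch below relies on a standard fact that is not stated in the text: for a totally ordered base set such as the natural numbers, the multiset ordering coincides with lexicographic comparison of the elements sorted in decreasing order. The function name and the sample nesting levels are hypothetical.

```python
def multiset_greater(m, n):
    """True when multiset m is strictly above multiset n in the multiset
    extension of > on natural numbers (both given as lists)."""
    return sorted(m, reverse=True) > sorted(n, reverse=True)


# Contracting a redex of nesting level 1 that contains one level-0 redex,
# which is copied three times: {1, 0} becomes {0, 0, 0}.
before = [1, 0]
after = [0, 0, 0]
print(multiset_greater(before, after))
print(multiset_greater(after, before))
```

The multiset grows in size but still decreases in the ordering, because the removed element 1 dominates every new element.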

The rest of the finite development theorem (i.e. confluence and consistency of
residuals) is proved as in the standard λ-calculus. A simple way is by use of the
labeled weak λ-calculus defined above, and by showing that it is confluent and
strongly normalizable when contracting only with the β'-rule. (The strong
normalization proof follows exactly the previous outline used for finite developments
by considering the nesting levels of β'-redexes.)

Proposition 3. Let M → M' by contraction of redex R, and let S' be a redex
of the weak λ-calculus, residual of S in M not inside R. Then S is also a redex
in the weak λ-calculus.

Proof: Let R be the redex contracted in M → M'. Then $M = M_1[\![x\backslash R]\!]$
and $M' = M_1[\![x\backslash R']\!]$ where $R = (\lambda z.A)B$ and $R' = A[\![z\backslash B]\!]$. We work by
contradiction, and suppose S is not a redex. Then a free variable y of S is bound
in M. We have several cases:

- R' and S' are disjoint. Then S' = S and if y in S is bound in M, clearly it
is the same in M'.

- S' contains R'. Then $S = S_1[\![x\backslash R]\!] \rightarrow S_1[\![x\backslash R']\!] = S'$. If the free variable y
of S is bound in M, then either y is in $S_1$ and remains in S', which
implies that S' is not a redex of the weak λ-calculus, or y is in R, which
contradicts the fact that R is also a redex of the weak λ-calculus.



□ 

The statement of the previous property may seem over-complicated, but some
care is needed since for instance when $I = \lambda x.x$, an easy counterexample is
$(\lambda z.Iz)a \rightarrow Ia$.

This proposition now allows us to state another interesting theorem of the weak
λ-calculus, namely Curry's standardization theorem. A standard reduction is
usually defined as a reduction contracting redexes in an outside-in and
left-to-right way. Precisely, a reduction of the form

$$M = M_0 \rightarrow M_1 \rightarrow \cdots M_n = N \qquad (n \ge 0)$$

is standard when for all i and j such that i < j, the $R_j$-redex contracted at
step j in $M_{j-1}$ is not a residual of a redex internal to or to the left of the
$R_i$-redex contracted at step i in $M_{i-1}$. We write $M \twoheadrightarrow_{\mathrm{st}} N$ for the existence of a
standard reduction from M to N. Notice that the leftmost outermost reduction
is a standard reduction (in the usual λ-calculus), but standard reductions may
be more general.

Theorem 2. If $M \twoheadrightarrow M'$, then $M \twoheadrightarrow_{\mathrm{st}} M'$.

Proof: One follows the proof scheme in [20] or checks the axioms of [29]. The
basic step of the proof follows from proposition 3. Take M → N → P by
contracting R in M and S in N. Suppose R and S are not in the standard
ordering. Then S is the residual of a redex S' in M to the left of or outside R. By
proposition 3, we know that S' is a redex of the weak λ-calculus and we may
contract it, getting N'. By the finite development theorem, we converge to P by
a finite development of the residuals of R in N'. □

3 Weak Explicit Substitutions

Although its language is minimal, some properties of the weak λ-calculus may
look non-intuitive and require at least a careful analysis. However it is close
to a calculus of closures, which is the language of interpreters for functional
languages. We now introduce such a calculus of closures, named the calculus of
weak explicit substitutions, and study its connection to the weak λ-calculus.

The language contains terms, which are reducible, and programs, which are
constant. The new terms with respect to the weak λ-calculus are closures,
represented as a λ-abstraction coupled with a substitution. We use the same names
for variables in terms and programs, and will specify their kind when necessary.
Programs correspond to all λ-terms. Substitutions are functions from variables
to terms represented by their (finite) graph. Notice that the domain of a
substitution is always finite.

$$\begin{array}{rcll}
M, N & ::= & x \mid MN \mid (\lambda x.P)[s] & \text{terms: variable, application, closure} \\
P, Q & ::= & x \mid PQ \mid \lambda x.P & \text{programs} \\
s & ::= & (x_1, M_1), (x_2, M_2), \ldots (x_n, M_n) & x_i \text{ distinct } (n \ge 0)
\end{array}$$

with $domain(s) = \{x_1, x_2, \ldots x_n\}$ and $s(x_1) = M_1$, $s(x_2) = M_2$, ... $s(x_n) = M_n$.
Notice that s is explicitly written as a set of pairs representing the graph of
function s. Thus the order in which this graph is written does not matter.
Now substitutions are extended to every program in the usual way.

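\[
\begin{array}{rcll}
(P Q)[\![s]\!] & = & P[\![s]\!]\, Q[\![s]\!] & \\
(\lambda x.P)[\![s]\!] & = & (\lambda x.P)[s] & \\
x[\![s]\!] & = & s(x) & \text{if } x \in domain(s) \\
x[\![s]\!] & = & x & \text{otherwise}
\end{array}
\]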

Thus substitutions are applied to every subexpression of a program, except for
λ-abstractions, where the substitution stays in the substitution part of a closure.
A substitution is modified by forcing one of its values:

\[
s[x \backslash N](y) =
\begin{cases}
N & \text{if } y = x \\
s(y) & \text{otherwise}
\end{cases}
\]

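The dynamics of weak explicit substitutions can now be defined by the following
β-rule and inference rules for active contexts

\[ (\beta) \quad (\lambda x.P)[s]\, N \to P[\![s[x \backslash N]]\!] \]

\[
(\xi)\ \frac{s \to s'}{(\lambda x.P)[s] \to (\lambda x.P)[s']}
\qquad
(\nu)\ \frac{M \to M'}{M N \to M' N}
\qquad
(\mu)\ \frac{N \to N'}{M N \to M N'}
\]

\[
(\sigma)\ \frac{s(x) \to M' \qquad s' = s[x \backslash M']}{s \to s'}
\]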
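The rules above can be tried out with a small sketch. The encoding is an assumption made for illustration only: programs and terms are tagged Python tuples, substitutions are dictionaries, and the names `inst`, `extend`, `step` and `normalize` are hypothetical. The function `step` picks one particular redex (outermost first, then left to right), whereas the calculus allows any redex to be contracted.

```python
# Programs: ("var", x) | ("app", P, Q) | ("lam", x, P)
# Terms:    ("var", x) | ("app", M, N) | ("clo", x, P, s), s a dict from names to terms

def inst(p, s):
    """P[[s]]: apply s to a program, stopping at abstractions."""
    if p[0] == "var":
        return s.get(p[1], p)                       # s(x) if x in domain(s), else x
    if p[0] == "app":
        return ("app", inst(p[1], s), inst(p[2], s))
    return ("clo", p[1], p[2], dict(s))             # abstraction becomes a closure

def extend(s, x, n):
    """s[x\\N]: force the value of x to N."""
    t = dict(s)
    t[x] = n
    return t

def step(m):
    """Contract one redex (outermost, then left to right); None if m is normal."""
    if m[0] == "app":
        f, a = m[1], m[2]
        if f[0] == "clo":                           # beta-rule at the root
            _, x, body, s = f
            return inst(body, extend(s, x, a))
        r = step(f)
        if r is not None:
            return ("app", r, a)                    # reduce in function position
        r = step(a)
        if r is not None:
            return ("app", f, r)                    # reduce in argument position
    elif m[0] == "clo":
        _, x, body, s = m
        for y, n in s.items():
            r = step(n)
            if r is not None:                       # reduce inside the substitution
                return ("clo", x, body, extend(s, y, r))
    return None

def normalize(m):
    """Iterate step until no redex is left (may loop on non-terminating terms)."""
    while (r := step(m)) is not None:
        m = r
    return m

# Example: M = (lam x. I x)[] (I[] I[]) with I = lam x. x
I = ("lam", "x", ("var", "x"))
I_clo = ("clo", "x", ("var", "x"), {})
arg = ("app", I_clo, I_clo)
M = ("app", ("clo", "x", ("app", I, ("var", "x")), {}), arg)
N = step(M)
```

Here `N` is the application of the closure of `I` over the substitution binding `x` to `I[] I[]`, applied to `I[] I[]`: the abstraction `I` in the body is not copied into, it only captures the substitution.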

The calculus of weak explicit substitutions is nearly a first-order orthogonal
term rewriting system. It manipulates sets for substitutions, which is not allowed
in a standard rewriting system where only terms in a free algebra are considered.
It is also defined with a scheme of axioms (the β-rule) with respect to programs.
Nevertheless, the calculus has the good properties of orthogonal systems.

Theorem 3. The calculus of weak explicit substitutions is confluent. 

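Proof: One proof uses the axioms in [29]; another, more direct proof may again
follow the Tait–Martin-Löf axiomatic technique. The only interesting cases are
the two commuting diagrams

\[
\begin{array}{ccc}
(\lambda x.P)[s]\, N & \to & (\lambda x.P)[s']\, N \\
\downarrow & & \downarrow \\
P[\![s[x \backslash N]]\!] & \twoheadrightarrow & P[\![s'[x \backslash N]]\!]
\end{array}
\qquad
\begin{array}{ccc}
(\lambda x.P)[s]\, N & \to & (\lambda x.P)[s]\, N' \\
\downarrow & & \downarrow \\
P[\![s[x \backslash N]]\!] & \twoheadrightarrow & P[\![s[x \backslash N']]\!]
\end{array}
\]

when $s \to s'$ and $N \to N'$. We then need three lemmas showing that one has
$s[x \backslash N] \to s'[x \backslash N]$, $P[\![s]\!] \twoheadrightarrow P[\![s']\!]$ and $s[x \backslash N] \to s[x \backslash N']$. □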






The standardization theorem also holds in weak explicit substitutions. The
main difficulty is in its statement, since residuals have to be defined; this can
be done. The definition can again be given by considering named redexes, and
extending the set of terms and reductions by

\[
\begin{array}{rcll}
M, N & ::= &                           & \text{term} \\
     & \mid & \ldots                   & \text{as previously} \\
     & \mid & (\lambda x.P)^{a}[s]\, N & \text{named redex}
\end{array}
\]

\[ (\beta') \quad (\lambda x.P)^{a}[s]\, N \to P[\![s[x \backslash N]]\!] \]

Proposition 1 can also be shown, and residuals of disjoint redexes remain disjoint.
Take for instance

\[ M = (\lambda x.I x)[\,]\,(I[\,]\, I[\,]) \to I[(x, I[\,]\, I[\,])]\,(I[\,]\, I[\,]) = N \]

where $I = \lambda x.x$. The external redex in $N$ is not a residual of any redex in $M$.
This also greatly simplifies the proof of the finite development theorem (see
proposition 2).

We now consider translations of weak explicit substitutions into the weak
λ-calculus and vice versa. First, the former may be easily translated into the
latter, since it suffices to expand substitutions through program abstractions. The
translation from weak explicit substitutions to the λ-calculus is

\[
\begin{array}{rcl}
\{x\} & = & x \\
\{M N\} & = & \{M\}\{N\} \\
\{(\lambda x.P)[s]\} & = & (\lambda x.P)[\![\{s\}]\!] \\
\{(x_1, M_1), (x_2, M_2), \ldots (x_n, M_n)\} & = & x_1 \backslash \{M_1\}, x_2 \backslash \{M_2\}, \ldots x_n \backslash \{M_n\}
\end{array}
\]

We assume that no variable $x_j$ is free in a term $\{M_i\}$, which can always be
achieved by renaming the $x_i$ variables. Thus, none of the $[\![x_i \backslash M_i]\!]$ substitutions
interferes with another one and we may safely use the “parallel” substitution
notation.

Proposition 4. If $M \to M'$, then $\{M\} \twoheadrightarrow \{M'\}$.

Proof: By structural induction on $M$. The key point is that, given a closure
$(\lambda x.C[x_i])[\cdots (x_i, M) \cdots]$, either the context $C[\ ]$ does not bind $x_i$, or this
occurrence of $x_i$ does not refer to the binding $(x_i, M)$. □

The converse translation from the λ-calculus to explicit substitutions is a bit
more involved. Several translations are possible for any given λ-term $M$. We
consider the translation with maximal substitutions. Let $P$ be a λ-term and $Q$
a sub-term of $P$, so $P = C[Q]$ with the context notation. Say that $Q$ is a free
sub-term of $P$ iff $C[\ ]$ does not bind $Q$. We will consider maximal free sub-terms
of $P$. For instance, the maximal free sub-term is $y$ in $\lambda x.x(y(\lambda z.xz))$, and it is
$y(\lambda z.yz)$ in $\lambda x.x(y(\lambda z.yz))$.






Notice that, given any λ-term $P$, maximal free sub-terms are all disjoint.
Hence, $P$ can be written by using the natural generalization of the context
notation to $n$ holes. We get: $P = C[x_1, x_2, \ldots x_n][\![x_1 \backslash P_1, x_2 \backslash P_2, \ldots x_n \backslash P_n]\!]$,
where $P_1, P_2, \ldots P_n$ are the maximal free sub-terms of $P$ and $x_1, x_2, \ldots x_n$
are fresh variables, all distinct. The translation from the weak λ-calculus to weak
explicit substitutions is as follows

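\[
\begin{array}{rcl}
\mathcal{T}(x) & = & x \\
\mathcal{T}(P Q) & = & \mathcal{T}(P)\, \mathcal{T}(Q) \\
\mathcal{T}(\lambda x.P) & = & (\lambda x.C[x_1, x_2, \ldots x_n])[(x_1, \mathcal{T}(P_1)), (x_2, \mathcal{T}(P_2)), \ldots (x_n, \mathcal{T}(P_n))]
\end{array}
\]

where $P_1, P_2, \ldots, P_n$ are the maximal free sub-terms of $P$.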
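The translation with maximal substitutions can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: terms are tagged tuples, substitutions are dictionaries, fresh variables are generated as `_v0`, `_v1`, ... (assumed not to occur in the input), and the helper names `fv`, `abstract_out`, `translate`, `expand` and `subst_prog` are hypothetical. The function `expand` implements the translation back to λ-terms with a naive substitution, which is only safe here because the abstracted sub-terms are free and the introduced variables are fresh.

```python
import itertools

_fresh = itertools.count()   # source of fresh variable names (assumption)

def fv(p):
    """Free variables of a plain lambda-term."""
    if p[0] == "var":
        return {p[1]}
    if p[0] == "app":
        return fv(p[1]) | fv(p[2])
    return fv(p[2]) - {p[1]}

def abstract_out(p, bound, subst):
    """Replace each maximal free sub-term of p by a fresh variable.

    bound is the set of variables bound by the enclosing context; the
    translated sub-terms are recorded in the dictionary subst.
    """
    if not (fv(p) & bound):                  # the context binds nothing in p
        name = f"_v{next(_fresh)}"
        subst[name] = translate(p)
        return ("var", name)
    if p[0] == "var":
        return p
    if p[0] == "app":
        return ("app", abstract_out(p[1], bound, subst),
                       abstract_out(p[2], bound, subst))
    return ("lam", p[1], abstract_out(p[2], bound | {p[1]}, subst))

def translate(p):
    """From lambda-terms to terms with closures ("clo", x, body, subst)."""
    if p[0] == "var":
        return p
    if p[0] == "app":
        return ("app", translate(p[1]), translate(p[2]))
    subst = {}
    body = abstract_out(p[2], {p[1]}, subst)
    return ("clo", p[1], body, subst)

def subst_prog(p, env):
    """Naive parallel substitution in a program (assumes no variable capture)."""
    if p[0] == "var":
        return env.get(p[1], p)
    if p[0] == "app":
        return ("app", subst_prog(p[1], env), subst_prog(p[2], env))
    inner = {y: n for y, n in env.items() if y != p[1]}
    return ("lam", p[1], subst_prog(p[2], inner))

def expand(m):
    """Back-translation from closures to plain lambda-terms."""
    if m[0] == "var":
        return m
    if m[0] == "app":
        return ("app", expand(m[1]), expand(m[2]))
    _, x, body, s = m
    env = {y: expand(n) for y, n in s.items() if y != x}
    return ("lam", x, subst_prog(body, env))

# The two example terms: lam x. x (y (lam z. x z)) and lam x. x (y (lam z. y z))
x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
P1 = ("lam", "x", ("app", x, ("app", y, ("lam", "z", ("app", x, z)))))
P2 = ("lam", "x", ("app", x, ("app", y, ("lam", "z", ("app", y, z)))))
T1, T2 = translate(P1), translate(P2)
```

In `T1` the only binding of the outer closure holds the variable `y`; in `T2` it holds the translation of the whole sub-term `y (lam z. y z)`. Expanding either result gives back the original term.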



Proposition 5. Given a λ-term $P$, we have $\{\mathcal{T}(P)\} = P$.

Proof: Easy; the choice of fresh variables for the $x_i$'s is obviously crucial. □

It is no surprise that the converse proposition does not hold, since $\mathcal{T}$ depends
on which sub-terms are “abstracted out”. Consider $M = (\lambda x.I)[\,]$; then we get
$\mathcal{T}(\{M\}) = (\lambda x.y)[(y, I[\,])]$.

Proposition 6. If $P \to P'$, then $\mathcal{T}(P) \twoheadrightarrow M$, with $\{M\} = P'$.

Proof: By structural induction. The key observation is as follows: let $R$ be the
redex contracted in the reduction $P \to P'$; since $R$ is a redex, there exists $Q_i$, the
maximal free sub-term of $P$ that includes $R$. Then, we can apply the induction
hypothesis to $Q_i$. □



4 Reduction Strategies with Weak Explicit Substitutions 



We consider three different evaluation strategies and show how they are naturally
connected to executions of λ-interpreters. We start with call-by-value in weak
explicit substitutions, which works on the following subset of terms

\[
\begin{array}{rcll}
M, N & ::= &                      & \text{term} \\
     &     & x                    & \text{variable} \\
     & \mid & M N                 & \text{application} \\
     & \mid & (\lambda x.P)[s_v]  & \text{closure} \\
P, Q & ::= & x \mid P Q \mid \lambda x.P & \text{programs} \\
V    & ::= & x \mid (\lambda x.P)[s_v]   & \text{values} \\
s_v  & ::= & (x_1, V_1), (x_2, V_2), \ldots (x_n, V_n) & x_i \text{ distinct } (n \ge 0)
\end{array}
\]

Values are either variables or closures. Notice that we take the convention that
$x V_1 V_2 \cdots V_n$ is not a value when $n > 0$. We could have taken a different
convention, but it would have just complicated our semantics without much interest.
We could also have decided that $x$ was not a value, but treating it as a value
seems more natural, especially when one adds constants to the set of terms. Now
functions need values as arguments in the following $\beta_v$ reduction rule.







\[ (\beta_v) \quad (\lambda x.P)[s_v]\, V \to P[\![s_v[x \backslash V]]\!] \]

For active contexts, it is sufficient to consider the ν and μ rules, since the ξ and σ
rules can never be applied: $s_v$-substitutions are irreducible.

We first notice that redexes in the call-by-value strategy are innermost
redexes in the calculus of weak explicit substitutions, but not all of them, since we
are interested in reductions leading to values. An alternative way of expressing
this strategy uses a bigstep SOS semantics with sequents of the
form $s \vdash P = V$, meaning that the result of evaluating $P$ with substitution $s$ is
the value $V$, as follows

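\[
s, (x, V) \vdash x = V
\qquad
s \vdash x = x \quad (x \notin domain(s))
\qquad
s \vdash \lambda x.P = (\lambda x.P)[s]
\]

\[
\frac{s \vdash P = (\lambda x.P')[s'] \qquad s \vdash Q = V' \qquad s'[x \backslash V'] \vdash P' = V}
     {s \vdash P Q = V}
\]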

Proposition 7. $s \vdash P = V$ iff $P[\![s]\!] \twoheadrightarrow V$ in the calculus of call-by-value weak
explicit substitutions.

Proof: The proof is obvious by induction on the pair $(l, \|P\|)$ where $l$ is the
length of the reduction and $\|P\|$ is the size of $P$. □

Obviously there are similar statements with call-by-name. The set of terms 
is now the full set of terms in the calculus of weak explicit substitutions. Values 
are variables or closures. The strategy is defined as the normal reduction 
which always contracts the leftmost outermost redex until reaching a value. The 
corresponding bigstep semantics is 

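\[
\frac{s' \vdash P = V}{s, (x, (P, s')) \vdash x = V}
\]

\[
s \vdash x = x \quad (x \notin domain(s))
\qquad
s \vdash \lambda x.P = (\lambda x.P)[s]
\]

\[
\frac{s \vdash P = (\lambda x.P')[s'] \qquad s'[x \backslash (Q, s)] \vdash P' = V}
     {s \vdash P Q = V}
\]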

In fact, to model call-by-name, our bigstep semantics needed a new kind of
delayed substitutions. Now substitutions may contain pairs $(Q, s)$ for any
program $Q$ (and not only for program abstractions), which somehow correspond
to the Algol 60 “thunks”. A closer treatment of them could be done with the
calculus of gradual weak explicit substitutions presented in section 6. A value of
the bigstep semantics can be mapped to a value of the call-by-name calculus of
weak explicit substitutions by replacing all sub-terms of form $(Q, s)$ by terms
$Q[\![s]\!]$. Let $V^{\circ}$ be the value mapped from $V$.

Proposition 8. $s \vdash P = V$ iff $P[\![s]\!] \twoheadrightarrow_{norm} V^{\circ}$ in the calculus of weak explicit
substitutions.

The proof is similar to the one of proposition 7. Notice that by the
standardization theorem, it is possible to show that the value computed by call-by-name
is minimal, namely if $M \twoheadrightarrow V$, then $M \twoheadrightarrow_{norm} V_n \twoheadrightarrow V$.



Call-by-need is more delicate, since one must represent some sharing of terms.
Following techniques in [24,5,25,27,26], we build a confluent theory of shared
reductions as follows. Terms and programs are labeled with names; names for
programs are single letters $a, b, c, \ldots$ taken in a given alphabet, names for terms
are strings $\alpha$ of letters which can be many-level underlined.



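\[
\begin{array}{rcll}
M, N & ::= & x^{\alpha} \mid (M N)^{\alpha} \mid (\lambda x.P)^{\alpha}[s] & \text{labeled term} \\
P, Q & ::= & x^{a} \mid (P Q)^{a} \mid (\lambda x.P)^{a} & \text{labeled programs} \\
s & ::= & (x_1, M_1), (x_2, M_2), \ldots (x_n, M_n) & x_i \text{ distinct } (n \ge 0) \\
\alpha, \beta & ::= & a \mid \alpha\beta \mid \underline{\alpha} & \text{labels}
\end{array}
\]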



The new labeled reduction rule is defined as

\[ (\beta_\ell) \quad ((\lambda x.P)^{\alpha}[s]\, N)^{\beta} \to \beta \cdot (\underline{\alpha} \circ P)[\![s[x \backslash N]]\!] \]



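where the labeled substitution $[\![\ ]\!]$ is defined inductively as follows in a labeled
program

\[
\begin{array}{rcll}
x^{\beta}[\![s]\!] & = & \beta \cdot s(x) & \text{if } x \in domain(s) \\
x^{\beta}[\![s]\!] & = & x^{\beta} & \text{otherwise} \\
(P Q)^{\beta}[\![s]\!] & = & (P[\![s]\!]\, Q[\![s]\!])^{\beta} & \\
(\lambda x.P)^{\beta}[\![s]\!] & = & (\lambda x.P)^{\beta}[s] &
\end{array}
\]

\[
\begin{array}{ll}
\alpha \cdot x^{\beta} = x^{\alpha\beta} & \alpha \circ x^{\beta} = x^{\alpha\beta} \\
\alpha \cdot (M N)^{\beta} = (M N)^{\alpha\beta} & \alpha \circ (P Q)^{\beta} = ((\alpha \circ P)(\alpha \circ Q))^{\alpha\beta} \\
\alpha \cdot (\lambda x.P)^{\beta}[s] = (\lambda x.P)^{\alpha\beta}[s] & \alpha \circ (\lambda x.P)^{\beta} = (\lambda x.P)^{\alpha\beta}
\end{array}
\]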

Above, we used two external operations with labels and labeled terms to modify
the external label of a term or to broadcast a label on a program. Notice that
this labeled calculus is different from the labeled λ-calculus as in [3,24,25], which
does not contain the broadcast operation. This new operation means that in the
weak β-reduction we need fresh copies of application nodes from the body of
the function before its application to an argument. However, it does not copy
abstractions, and instead builds new closures.



Proposition 9. In the labeled calculus of weak explicit substitutions, the
following three lemmas hold

\[
\begin{array}{ll}
(\alpha\beta) \cdot M = \alpha \cdot \beta \cdot M & (i) \\
M \to M' \;\Rightarrow\; \alpha \cdot M \to \alpha \cdot M' & (ii) \\
s \to s' \;\Rightarrow\; P[\![s]\!] \twoheadrightarrow P[\![s']\!] & (iii)
\end{array}
\]

Proof: (i) is obvious by definition. And (i) ⇒ (ii) ⇒ (iii). □

Proposition 10. The labeled calculus of weak explicit substitutions is confluent.

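Proof: As in previous confluence proofs, we only consider local confluence,
leaving the remaining part of the proof to the Tait–Martin-Löf axiomatic method.
When $s \to s'$ and $N \to N'$, we have two interesting cases corresponding to the
two diagrams

\[
\begin{array}{ccc}
((\lambda x.P)^{\alpha}[s]\, N)^{\beta} & \to & ((\lambda x.P)^{\alpha}[s']\, N)^{\beta} \\
\downarrow & & \downarrow \\
\beta \cdot (\underline{\alpha} \circ P)[\![s[x \backslash N]]\!] & \twoheadrightarrow & \beta \cdot (\underline{\alpha} \circ P)[\![s'[x \backslash N]]\!]
\end{array}
\]

and

\[
\begin{array}{ccc}
((\lambda x.P)^{\alpha}[s]\, N)^{\beta} & \to & ((\lambda x.P)^{\alpha}[s]\, N')^{\beta} \\
\downarrow & & \downarrow \\
\beta \cdot (\underline{\alpha} \circ P)[\![s[x \backslash N]]\!] & \twoheadrightarrow & \beta \cdot (\underline{\alpha} \circ P)[\![s[x \backslash N']]\!]
\end{array}
\]

which commute by using the lemmas of proposition 9. □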

As in other theories of labeled λ-calculi, labels are useful for naming redexes,
which are either residuals of redexes in a given initial term $M$, or created along
reductions. The name of a redex is the string of labels on the path from its
application node to its abstraction node, thus naming the interaction between
these two nodes. So, the name of $((\lambda x.P)^{\alpha}[s]\, N)^{\beta}$ is $\alpha$. A complete labeled
reduction step $M \stackrel{\alpha}{\Rightarrow} N$ is the finite development of all redexes with name $\alpha$ in $M$.
We write $\Rightarrow$ for a complete anonymous step, and $\Rightarrow^{*}$ for several steps. Finally,
Init($M$) will be the predicate which is true iff every label in $M$ is a distinct letter.
Intuitively, Init($M$) means that term $M$ does not contain shared sub-terms.

Proposition 11. Let Init(Mo) and Mq =» M . If M N and M M' , 
then N TV' and M' N' for some N' . 

Proof: The proof is far too complex to be presented in this article. □

This property is nice since it shows that one has a confluent sub-theory of weak
complete reductions.

Call-by-need strategies =^ 5 ^^ correspond to complete normal reductions 
in the labeled calculus of weak explicit substitutions when the initial labeled 
term M checks predicate Init(M). At each step, all redexes with the same name 
as the one of the leftmost outermost redex are contracted. One can show that 
the number of steps to get a value with this reduction is always minimal: no
other reduction reaches a value in fewer steps. In contrast with the theory of the full
λ-calculus, there is always a simple reduction $\twoheadrightarrow$ which can get the value with
the same cost.

Now the goal is to define a bigstep semantics for call-by-need. We make
sharing explicit by considering stores $\Sigma$, which are mappings from locations
$\ell$ to either thunks $(P, s)$ or values $V$. Substitutions $s$ now bind variables to
locations. A store may appear as a context or as a result in judgments, all of the
form $s \bullet \Sigma \vdash P = V \bullet \Sigma'$. Accessing the value of a variable is now a bit more
complex, since the content of its location can be of two kinds. When it is a value,
we proceed as before. When it is a thunk, one has to evaluate it, and to modify the
corresponding location in the store before returning the value and the modified store.



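\[
\frac{s' \bullet \Sigma, (\ell, (P, s')) \vdash P = V \bullet \Sigma', (\ell, (P, s'))}
     {s, (x, \ell) \bullet \Sigma, (\ell, (P, s')) \vdash x = V \bullet \Sigma', (\ell, V)}
\]

\[ s, (x, \ell) \bullet \Sigma, (\ell, V) \vdash x = V \bullet \Sigma, (\ell, V) \]

\[ s \bullet \Sigma \vdash x = x \bullet \Sigma \quad (x \notin domain(s)) \]

\[ s \bullet \Sigma \vdash \lambda x.P = (\lambda x.P)[s] \bullet \Sigma \]

\[
\frac{s \bullet \Sigma \vdash P = (\lambda x.P')[s'] \bullet \Sigma' \qquad s'[x \backslash \ell] \bullet \Sigma', (\ell, (Q, s)) \vdash P' = V \bullet \Sigma''}
     {s \bullet \Sigma \vdash P Q = V \bullet \Sigma''}
\]

In the last rule, we assume that $\ell$ is a fresh location.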

Notice that, by contrast with Launchbury's semantics [21], there is no need for renaming
while substituting a variable in the first rule. No capture of variables can occur
here. Another difference is that we make a clear distinction between variables
and locations. Let us now write $M^*$, $P^*$, $s^*$ for the unlabeled terms, programs
and substitutions obtained by erasing all labels within the labeled terms $M$,
programs $P$, and substitutions $s$. We also use the value-mapping notation defined
in the call-by-name subsection.



Proposition 12. Let InU{P[[s\). Then s* • % ^ P* = V* * E iff P[[s]] 

A+ in the labeled calculus of weak explicit substitutions. 

Proof: The proof is again much too complex to be presented here. □



5 Runtime Interpreters 

Sets may be represented by lists with two constructors nil and cons and 
substitutions may become association lists. Substitutions may be then called 
environments. We do not duplicate the corresponding SOS of call-by-value with 
environments, which leads to the following recursive λ-evaluator

\[
\begin{array}{rcl}
eval(x, (x, V) :: s) & = & V \\
eval(x, (y, V) :: s) & = & eval(x, s) \\
eval(x, nil) & = & x \\
eval(\lambda x.P, s) & = & (\lambda x.P)[s] \\
eval(P Q, s) & = & eval(P', (x, eval(Q, s)) :: s') \quad \text{if } eval(P, s) = (\lambda x.P')[s']
\end{array}
\]
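The evaluator above transcribes almost line for line into executable form. The following is a minimal sketch under illustrative assumptions: programs are tagged tuples, environments are Python tuples of pairs with the most recent binding first, and `eval_cbv` is a hypothetical name. The case where the function position does not evaluate to a closure is undefined in the equations; the sketch raises an error there.

```python
# Programs: ("var", x) | ("app", P, Q) | ("lam", x, P)
# Values:   ("var", x) | ("clo", x, P, env), env a tuple of (name, value) pairs

def eval_cbv(p, env):
    """Call-by-value evaluation of program p in environment env."""
    if p[0] == "var":
        for y, v in env:                 # walk the association list
            if y == p[1]:
                return v
        return p                         # eval(x, nil) = x
    if p[0] == "lam":
        return ("clo", p[1], p[2], env)  # build a closure, do not enter the body
    f = eval_cbv(p[1], env)
    if f[0] != "clo":
        raise ValueError("function position did not evaluate to a closure")
    _, x, body, env1 = f
    v = eval_cbv(p[2], env)              # the argument is evaluated before the call
    return eval_cbv(body, ((x, v),) + env1)

# Example: (K a) b with K = lam x. lam y. x evaluates to the free variable a.
K = ("lam", "x", ("lam", "y", ("var", "x")))
prog = ("app", ("app", K, ("var", "a")), ("var", "b"))
value = eval_cbv(prog, ())
```

Using immutable tuples for environments means that a closure can safely keep a reference to the environment in which it was created.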






Similarly with call-by-name, one gets the functional interpreter

\[
\begin{array}{rcl}
eval(x, (x, (P, s')) :: s) & = & eval(P, s') \\
eval(x, (y, (P, s')) :: s) & = & eval(x, s) \\
eval(x, nil) & = & x \\
eval(\lambda x.P, s) & = & (\lambda x.P)[s] \\
eval(P Q, s) & = & eval(P', (x, (Q, s)) :: s') \quad \text{if } eval(P, s) = (\lambda x.P')[s']
\end{array}
\]

Finally, the third interpreter is for call-by-need 

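\[
\begin{array}{rcl}
eval(x, (x, \ell) :: s) & = & V \quad \text{if } !\ell = (P, s') \text{ and } V = eval(P, s') \quad (\text{side effect } \ell \leftarrow V) \\
eval(x, (x, \ell) :: s) & = & V \quad \text{if } !\ell = V \\
eval(x, (y, \ell) :: s) & = & eval(x, s) \\
eval(x, nil) & = & x \\
eval(\lambda x.P, s) & = & (\lambda x.P)[s] \\
eval(P Q, s) & = & eval(P', (x, \ell) :: s') \quad \text{if } eval(P, s) = (\lambda x.P')[s'] \text{ and } \ell = ref(Q, s)
\end{array}
\]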

In the last case, we use mutable variables $\ell$ with the ML syntax for creating the
reference $\ell$ and accessing its content $!\ell$. The three interpreters are easy mappings
of the SOS rules seen in the previous section, and their soundness with
respect to this operational semantics is straightforward.
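To make the sharing performed by the call-by-need interpreter visible, here is a small sketch. It rests on illustrative assumptions: the class `Ref` stands in for ML references, thunks are tagged tuples, and the `forced` list is an instrumentation device added for the example (it is not part of the interpreter in the text).

```python
class Ref:
    """Mutable cell standing in for an ML reference (ref, !, :=)."""
    def __init__(self, content):
        self.content = content

def eval_need(p, env, forced):
    """Call-by-need evaluation of program p.

    env is a tuple of (name, Ref) pairs; forced collects the body of every
    thunk that actually gets evaluated, so that sharing can be observed.
    """
    if p[0] == "var":
        for y, cell in env:
            if y == p[1]:
                c = cell.content
                if c[0] == "thunk":                  # the cell holds (P, s')
                    _, q, env1 = c
                    forced.append(q)
                    v = eval_need(q, env1, forced)
                    cell.content = v                 # side effect: overwrite with V
                    return v
                return c                             # the cell already holds a value
        return p                                     # eval(x, nil) = x
    if p[0] == "lam":
        return ("clo", p[1], p[2], env)
    f = eval_need(p[1], env, forced)
    if f[0] != "clo":
        raise ValueError("function position did not evaluate to a closure")
    _, x, body, env1 = f
    cell = Ref(("thunk", p[2], env))                 # delay the argument
    return eval_need(body, ((x, cell),) + env1, forced)

# Example: (lam x. x x) (I I) uses its argument twice.
I = ("lam", "x", ("var", "x"))
II = ("app", I, I)
forced = []
result = eval_need(("app", ("lam", "x", ("app", ("var", "x"), ("var", "x"))), II), (), forced)
```

Although `x` occurs twice in the function body, the argument `I I` appears only once in `forced`: the second access finds the value already stored in the cell. With the call-by-name interpreter the same argument would be evaluated at each access.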

We now consider two problems on functional interpreters, and try more to 
state them within weak explicit substitutions than to give solutions, far beyond 
the scope of this paper. 

The first one is stack allocation for closures. It is well-known that functional 
languages cannot be implemented with the Algol/Pascal stack discipline, since 
closures often need to be heap-allocated. In these imperative languages, each 
environment cell has an extra component, a link to the outer environment, also 
named “static link” in the compiler terminology. The stack is now represented 
by the environment at the left of each rule in our bigstep operational semantics. 
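In the call-by-value case, the SOS semantics is now as follows

\[ s, (n, (x, V)) \vdash x = V \]

\[
\frac{s[1..n] \vdash x = V}{s, (n, (y, V')) \vdash x = V}
\]

\[ s \vdash \lambda x.P = (\lambda x.P)[\,|s|\,] \]

\[
\frac{s \vdash P = (\lambda x.P')[n] \qquad s \vdash Q = V' \qquad s, (n, (x, V')) \vdash P' = V}
     {s \vdash P Q = V}
\]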

where $|s|$ is the length of $s$, and $s[1..n]$ is the list of the first $n$ elements of $s$.
Intuitively, in the Algol/Pascal subset of terms, the values of arguments of
functions can only refer to environments already in the stack. This is why closure
values are represented as $(\lambda x.P)[n]$ where $n$ refers to an entry in the current
environment. Now it remains to show formally that this works. Clearly, it fails
for any curried function. Take for instance $s \vdash (\lambda x.\lambda y.x) Q = (\lambda y.x)[\,|s| + 1\,]$,
which yields a result escaping from stack $s$.

In this implementation of environments with stacks, we need to leave the
stack unchanged after the evaluation of each application, which makes the
evaluation of curried functions fail. There are other techniques with stacks,
dynamic ones as in Caml [22] when functions have arities, or with a static escape
analysis as in [6]. The trick is then to try to keep the stack unchanged only after
the evaluation of function bodies.

The second problem that we consider in this section is graph implementations
of functional languages, which are mainly useful for lazy languages. So we are
in the case of weak explicit substitutions with call-by-name. The weak labeled
calculus of section 4 can be used to characterize these graphs. Clearly in the
β-rule,

\[ (\beta_\ell) \quad ((\lambda x.P)^{\alpha}[s]\, N)^{\beta} \to \beta \cdot (\underline{\alpha} \circ P)[\![s[x \backslash N]]\!] \]

the broadcast operation $\underline{\alpha} \circ P$ describes the creation of a fresh copy of the function
body (except for its abstraction sub-terms). The rest of the term in which the
β-reduction is performed remains unchanged, with the same sharing as before
the reduction step. This operation was already considered by Wadsworth in his
dissertation, but in the context of sharing for the full λ-calculus. In our case, an
intuitive graph β-rule would be written

\[ (\beta_g) \quad ((\lambda x.P)[s]\, N) \to (copy(P))[\![s[x \backslash N]]\!] \]

However, it is striking how complex the proof of the correspondence between the
labeled calculus and the straightforward graph implementation is. The
goal is to prove that nodes connected by the same labeled path are identical in the
graph implementation. Notice that this proof was already quite involved in the
full λ-calculus, where it required building the so-called context semantics [16].
But one could expect a much simpler proof in the weak case.

6 Weak Explicit Substitutions with Ministep Semantics 

Usually, the various authors working on explicit substitutions start here. Their
goal is to represent not only bindings of variables, but also the way
substitutions are gradually pushed inside programs. Often, bindings are treated with
de Bruijn numbers. Notice that we never used them, since names of variables
were sufficient. This is because, in our weak calculi, substitutions never cross
binders, namely λ-abstractions or other substitutions. Therefore, one need not
care about α-renaming of bound variables. The second part of the motivation of
the usual work on explicit substitutions is to study the progression of
substitutions in terms, which we avoided since substitutions are always pushed to (free)
variables or abstractions.

We consider the following ministep semantics for the calculus of weak
explicit substitutions. The set of terms now allows closures on any program, and
substitutions are represented by association lists.

\[
\begin{array}{rcll}
M, N & ::= &        & \text{term} \\
     &     & x      & \text{variable} \\
     & \mid & M N   & \text{application} \\
     & \mid & P[s]  & \text{extended closure} \\
P, Q & ::= & x \mid P Q \mid \lambda x.P & \text{programs} \\
s    & ::= & nil    & \text{empty substitution} \\
     & \mid & (x, M) :: s & \text{association list}
\end{array}
\]

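The reduction rules are defined by the following five non-overlapping left-linear
rewriting rules

\[
\begin{array}{rcl}
(P Q)[s] & \to & P[s]\, Q[s] \\
x[(x, M) :: s] & \to & M \\
x[(y, M) :: s] & \to & x[s] \\
x[nil] & \to & x \\
(\lambda x.P)[s]\, N & \to & P[(x, N) :: s]
\end{array}
\]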

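Any sub-term may be reduced, as described by the following definition of active
contexts

\[
\frac{M \to M'}{M N \to M' N}
\qquad
\frac{N \to N'}{M N \to M N'}
\qquad
\frac{s \to s'}{P[s] \to P[s']}
\]

\[
\frac{M \to M'}{(x, M) :: s \to (x, M') :: s}
\qquad
\frac{s \to s'}{(x, M) :: s \to (x, M) :: s'}
\]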

We do not detail the different proofs in this new calculus. Notice that the
variables of this calculus (in the sense of term rewriting systems) are $M$, $N$, $s$. All
other operators are constants. Thus this ministep calculus may be considered as
an orthogonal system, and therefore is confluent, with the standardization
theorem as stated in [29]. The normal strategy corresponds to the leftmost outermost
reduction. And some simulation of the weak calculus of section 3 may easily be
shown. Therefore this calculus also implements the weak λ-calculus. Finally, one
can investigate sharing within this ministep semantics as in [26,27], which leads
to a ministep implementation of graph reduction, as considered at the end of the
previous section or in interpreters of lazy functional languages [31]. Remark that
sharing then works for the whole set of reduction rules and not only for the β-rule.
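The five ministep rules and the active contexts can be exercised with a small rewriting sketch. The encoding is assumed for illustration: an extended closure is the tuple `("clo", P, s)` where `s` is a Python tuple of pairs playing the role of the association list, and the names `step` and `normalize` are hypothetical. The function `step` implements one leftmost outermost choice of redex.

```python
# Programs: ("var", x) | ("app", P, Q) | ("lam", x, P)
# Terms:    ("var", x) | ("app", M, N) | ("clo", P, s), s a tuple of (name, term) pairs

def step(m):
    """One leftmost outermost ministep; None if m is in normal form."""
    if m[0] == "clo":
        p, s = m[1], m[2]
        if p[0] == "app":                            # (P Q)[s] -> P[s] Q[s]
            return ("app", ("clo", p[1], s), ("clo", p[2], s))
        if p[0] == "var":
            if not s:                                # x[nil] -> x
                return ("var", p[1])
            (y, n), rest = s[0], s[1:]
            if y == p[1]:                            # x[(x, M) :: s] -> M
                return n
            return ("clo", p, rest)                  # x[(y, M) :: s] -> x[s]
        for i, (y, n) in enumerate(s):               # abstraction: look inside s
            r = step(n)
            if r is not None:
                return ("clo", p, s[:i] + ((y, r),) + s[i + 1:])
        return None
    if m[0] == "app":
        f, a = m[1], m[2]
        if f[0] == "clo" and f[1][0] == "lam":       # (lam x.P)[s] N -> P[(x, N) :: s]
            return ("clo", f[1][2], ((f[1][1], a),) + f[2])
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        if r is not None:
            return ("app", f, r)
    return None                                      # plain variables are normal

def normalize(m):
    """Iterate step, returning the normal form and the number of ministeps."""
    count = 0
    while (r := step(m)) is not None:
        m, count = r, count + 1
    return m, count

# Example: ((lam x. x) y)[nil]
I = ("lam", "x", ("var", "x"))
start = ("clo", ("app", I, ("var", "y")), ())
final, steps = normalize(start)
```

On this example the substitution is first distributed over the application, then the β-like rule fires, and two variable lookups finish the computation, for four ministeps in total.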



7 Conclusion 

So, before jumping into the full world of explicit substitutions, we hope to have
shown that the fundamental properties of the syntax of weak theories in the
λ-calculus or in the weak calculus of explicit substitutions are still interesting. In
this paper, we did not consider preservation of strong normalization in the typed
case, but clearly it holds. We also restricted our study to the sole β-reduction,
because it contains many of the problems, but δ-rules may be added in each
of the three main calculi considered here, which rapidly provides the power of
Plotkin's PCF or of an ML kernel. These extensions are rather easy since we have
no critical pairs in the various term rewriting systems. Similarly, data structures
may be added (lists, records, algebraic structures) as done for records in [2].

We showed that our weak calculi are sufficiently expressive to describe the
functional part of programming language runtimes, and could be used as a
basis to model or to derive some program transformations or program analyses
within compilers (stack allocation, graph implementation, dependency analysis,
slicing) [6,18,2,15]. But it would be very interesting to understand whether the
fundamental properties of the underlying calculi are useful. For instance, how
much of confluence or of the standardization property is really used? Also, we
would like to understand which of the three calculi presented here is the most
useful.

Finally, many of the results of this paper were considered as folk theorems,
rather easy to prove. We hope to have shown that some of the proofs may deserve
some attention. In fact, some of them are not easy at all.

Acknowledgments 

We thank Bruno Blanchet for stimulating discussions on stack allocation. 

References 

1. M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Lévy. Explicit substitutions. Journal
of Functional Programming, 6(2):299-327, 1996.

2. M. Abadi, B. Lampson, and J.-J. Lévy. Analysis and caching of dependencies. In
Proc. of the 1996 ACM SIGPLAN International Conference on Functional
Programming, pages 83-91. ACM Press, May 1996.

3. H. P. Barendregt. The Lambda Calculus, Its Syntax and Semantics. North-Holland,
1981.

4. Z.-E.-A. Benaissa, D. Briaud, P. Lescanne, and J. Rouyer-Degli. Lambda-upsilon,
a calculus of explicit substitution which preserves strong normalisation. Research
Report 2477, Inria, 1995.

5. G. Berry and J.-J. Lévy. Minimal and optimal computations of recursive programs.
In Journal of the ACM, volume 26. ACM Press, 1979.

6. B. Blanchet. Escape analysis: Correctness proof, implementation and
experimental results. In Proc. of 25th ACM Symposium on Principles of Programming
Languages. ACM Press, 1998.

7. R. Bloo and K. H. Rose. Combinatory reduction systems with explicit substitutions
that preserve strong normalization. In Proc. of the 1996 Conference on Rewriting
Techniques and Applications. Springer, 1996.

8. N. Çağman and J. R. Hindley. Combinatory weak reduction in lambda calculus.
Theoretical Computer Science, 198:239-249, 1998.

9. G. Cousineau, P.-L. Curien, and M. Mauny. The categorical abstract machine. In
Proc. of the Second International Conference on Functional Programming Languages
and Computer Architecture. ACM Press, 1985.

10. P.-L. Curien. Categorical Combinators, Sequential Algorithms and Functional
Programming. Pitman, 1986.

11. P.-L. Curien, T. Hardin, and J.-J. Lévy. Confluence properties of weak and strong
calculi of explicit substitutions. Journal of the ACM, 43(2):362-397, 1996.

12. R. David and B. Guillaume. The lambda-l calculus. In Proc. of the Second
International Workshop on Explicit Substitutions: Theory and Applications to Programs
and Proofs, 1999.

13. R. Di Cosmo and D. Kesner. Strong normalization of explicit substitutions via
cut elimination in proof nets. In Proc. of the 1997 Symposium on Logic in
Computer Science, 1997.

14. G. Dowek, T. Hardin, and C. Kirchner. Higher-order unification via explicit
substitutions: the case of higher-order patterns. In M. Maher, editor, Proc. of the
Joint International Conference and Symposium on Logic Programming, 1996.

15. J. Field. On laziness and optimality in lambda interpreters: Tools for specification
and analysis. In Proc. of the Seventeenth Conference on Principles of Programming
Languages, volume 6, pages 1-15. ACM Press, 1990.

16. G. Gonthier, M. Abadi, and J.-J. Lévy. The geometry of optimal lambda reduction.
In Proc. of the Nineteenth Conference on Principles of Programming Languages,
volume 8. ACM Press, 1992.

17. T. Hardin. Confluence results for the pure strong categorical logic CCL.
Lambda-calculi as subsystems of CCL. Journal of Theoretical Computer Science, 65:291-342,
1989.

18. T. Hardin, L. Maranget, and B. Pagano. Functional runtimes within the
lambda-sigma calculus. Journal of Functional Programming, 8(2), March 1998.

19. J. R. Hindley. Combinatory reductions and lambda reductions compared. Zeit.
Math. Logik, 23:169-180, 1977.

20. J.-W. Klop. Combinatory Reduction Systems. PhD thesis, Mathematisch Centrum,
Amsterdam, 1980.

21. J. Launchbury. A natural semantics for lazy evaluation. In Proc. of the 1993
Conference on Principles of Programming Languages. ACM Press, 1993.

22. X. Leroy. The ZINC experiment: an economical implementation of the ML
language. Technical report 117, INRIA, 1990.

23. P. Lescanne. From lambda-sigma to lambda-upsilon, a journey through calculi of
explicit substitutions. In Proc. of the Twenty-First Conference on Principles of
Programming Languages, 1994.

24. J.-J. Lévy. Réductions correctes et optimales dans le lambda-calcul. PhD thesis,
Univ. of Paris 7, Paris, 1978.

25. J.-J. Lévy. Optimal reductions in the lambda-calculus. In J. Seldin and J. Hindley,
editors, To H.B. Curry: Essays on Combinatory Logic, Lambda-Calculus and
Formalism. Academic Press, 1980. On the occasion of his 80th birthday.

26. L. Maranget. Optimal derivations in orthogonal term rewriting systems and in
weak lambda calculi. In Proc. of the Eighteenth Conference on Principles of
Programming Languages. ACM Press, 1991.

27. L. Maranget. La stratégie paresseuse. PhD thesis, Univ. of Paris 7, Paris, 1992.

28. P.-A. Melliès. Typed lambda-calculus with explicit substitutions may not
terminate. In Proc. of the Second Conference on Typed Lambda-Calculi and Applications.
Springer, 1995. LNCS 902.

29. P.-A. Melliès. Description Abstraite des Systèmes de Réécriture. PhD thesis, Univ.
of Paris 7, December 1996.

30. R. Milner. Action calculi and the pi-calculus. In Proc. of the NATO Summer
School on Logic and Computation. Marktoberdorf, 1993.

31. S. L. Peyton-Jones. The Implementation of Functional Programming Languages.
Prentice-Hall, 1987.



Approximation Algorithms for Routing and Call
Scheduling in All-Optical Chains and Rings*



Luca Becchetti¹**, Miriam Di Ianni³, and Alberto Marchetti-Spaccamela²

¹ Technische Universität Graz, Institut für Mathematik B;
luca@opt.math.tu-graz.ac.at

² Dipartimento di Informatica e Sistemistica, Università di Roma “La Sapienza”;
alberto@dis.uniroma1.it

³ Dipartimento di Ingegneria Elettronica e dell'Informazione, Università di Perugia;
diianni@diei.unipg.it



Abstract. We study the problem of routing and scheduling requests of 
limited durations in an all-optical network. The task is servicing the re- 
quests, assigning each of them a starting time and a wavelength, with 
restrictions on the number of available wavelengths. The goal is minimiz- 
ing the overall time needed to serve all requests. We propose constant 
approximation algorithms for both ring and chain networks. In doing 
this, we also propose a polynomial-time approximation scheme for the 
problem of routing weighted calls on a directed ring with minimum load. 



1 Introduction 

All-optical networks allow very high transmission rates, widely exceeding those 
that can be guaranteed by traditional electronic technology. Wavelength Divi- 
sion Multiplexing (WDM) allows the concurrent transmission of multiple data 
streams on the same optic fiber; different data streams on the same optical link 
at the same time and in the same direction use different wavelengths (currently 
30-40 in experimental settings) [6,13]. Moreover, the high speed achievable with 
all-optical networks is mainly due to the fact that the signal is kept in optical 
form throughout its transmission from source to destination. 

In this paper, we address the problem of Call Scheduling in all-optical
networks, that is, the problem of scheduling a set of communication requests (calls),
each one characterized by a source-destination pair and a duration. Following [13]
we assume that optical links allow only one-way communication: if there is a link
from x to y then the information flow is unidirectional from source to tail. In
order to allow bidirectional connection, we assume that if there is a link from x
to y there is also a link from y to x [17]. A different model assumes that the
optical links are undirected and allow bidirectional communication; we remark
that this model is less realistic with respect to current technology [13].

* This work was partially supported by ESPRIT project ALCOM-IT and by the 
START program Y43-MAT of the Austrian Ministry of Science. 

** Part of this research was done while the author was at Dipartimento di Informatica 
e Sistemistica, University of Rome “La Sapienza”. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 201—213, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






Given a network and a set of calls, Minimum Call Scheduling requires assigning
a directed path, a wavelength and a starting time to each call, subject
to the constraint that no pair of calls assigned to the same wavelength use the
same (directed) arc at the same time and that the number of assigned wavelengths
does not exceed some bound k. The objective is to minimize the overall time to
accommodate all calls.

Related work. To the best of our knowledge, call scheduling in all-optical
networks has so far received little attention.

Concerning optical networks, several related problems have been extensively 
studied. Min Path Coloring [3,4,15,17] aims to find routes and a wavelength
assignment for a set of calls of infinite duration that minimize the number of 
used wavelengths. In the Max Call admission problem calls are presented in 
an on-line fashion; when a call arrives it can be either rejected or accepted; in 
the latter case it must be immediately satisfied using available resources. The 
objective is maximizing the number of accepted calls [2,3]. 

In the Min Ring Loading problem, arising in the design of SONET networks [18],
the aim is to devise a routing of calls in a ring network such
that the maximum load of a link (defined as the sum of the durations of all
calls that use the link) is minimized. This problem is polynomial-time solvable
under the assumption that all calls have unit durations [19,18]. The undirected
ring loading problem (i.e., each link allows two-way communication) has been
considered in [18], where a constant approximation algorithm is presented. This
result has been improved and a polynomial time approximation scheme has been
proposed [14]. The case of links that allow one-way communication is the
Minimum Weighted Directed Ring Routing (MIN-WDRR) problem and will be
considered in the present paper.

Scheduling calls with minimum makespan has been considered in non optical 
networks. Concerning packet switched networks, a seminal result by Leighton [16] 
proved the existence of a schedule delivering all packets in a number of steps 
within a constant factor of the lower bound. In [7,8] the call scheduling problem 
has been studied in the context of ATM networks. In particular, the authors 
propose approximation algorithms for star and ring networks. 

Results of the paper. In Section 3 we propose approximation algorithms for
call scheduling in chain networks. Namely, we propose a 3-approximation
algorithm for the call scheduling problem in chains when just one wavelength
is available, by a reduction to the problem of Minimum Dynamic Storage
Allocation (MIN-DSA) for which a 3-approximation algorithm is known [11].
We then extend the algorithm to the general case of k available
wavelengths, obtaining a 5-approximation algorithm. The above results hold also in
the case of undirected chains.

In Section 4 we first give a polynomial-time approximation scheme for
Minimum Weighted Directed Ring Routing. This result, together with those
of Section 3, yields a $(12 + \epsilon)$-approximation algorithm for the call
scheduling problem in ring networks. We remark that the problem of routing



Approximation Algorithms for Routing and Call Scheduling 203 



weighted calls in a ring network with the aim of minimizing the maximum load, 
has itself a practical relevance in SONET networks [18,14,19]. 

The results of section 3 hold also in the case of undirected chains. For the 
sake of brevity, most proofs are omitted. 

2 Preliminaries 

In the following, $G = (V, E)$ denotes a simple, directed graph on the vertex set
$V = \{v_0, v_1, \ldots, v_{m-1}\}$ such that arc $(v_i, v_j)$ exists if and only if arc $(v_j, v_i)$
exists. A network is a pair $(G, k)$ where $k$ is the number of available wavelengths
(from now on, colors) on each arc. A call $C = [s, d, l]$ is an ordered pair of vertices
$s, d$ completed by an integer $l > 0$ representing the call duration. Namely, $s$ is
the source vertex originating the data stream to be sent to the destination $d$.

Given a network $(G, k)$ and a set $\mathcal{C} = \{C_h = [s_h, d_h, l_h] : h = 1, \ldots, n\}$ of
calls, a routing is a function $R : \mathcal{C} \to P(G)$, where $P(G)$ is the set of simple
paths in $G$. Given a network $(G, k)$, a set $\mathcal{C}$ of calls and a routing $R$ for $\mathcal{C}$, a
schedule is an assignment of starting times and colors to calls such that at every
time no two calls with the same color use the same arc; formally, a schedule is
a pair $(S, F)$, with $S : \mathcal{C} \to \{0, 1, \ldots\}$ a function assigning starting times
to calls and $F : \mathcal{C} \to \{1, \ldots, k\}$ a function assigning colors to calls. $S$ and $F$ must
be such that, for any $(u, v) \in E$ and $C_i, C_j \in \mathcal{C}$, if $(u, v) \in R(C_i)$, $(u, v) \in R(C_j)$
and $F(C_i) = F(C_j)$ then either $S(C_i) \ge S(C_j) + l_j$ or $S(C_j) \ge S(C_i) + l_i$.
$T = \max_{1 \le h \le n} \{S(C_h) + l_h\}$ is the makespan of schedule $(S, F)$. $T^*(x)$ or
simply $T^*$ denotes the makespan of an optimal solution for an instance $x$.
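The feasibility condition above can be checked mechanically. The following is a minimal sketch, assuming calls are given as triples (s, d, l) and each route as a set of arcs; the function names `is_feasible` and `makespan` are ours and do not come from the paper.

```python
from itertools import combinations

def is_feasible(calls, routes, start, color):
    """Check the schedule condition of Section 2.

    calls[h]  = (s, d, l)          source, destination, duration
    routes[h] = set of arcs (u, v) used by call h
    start[h], color[h]             starting time and color of call h
    """
    for i, j in combinations(range(len(calls)), 2):
        if color[i] != color[j]:
            continue                      # different colors never conflict
        if not (routes[i] & routes[j]):
            continue                      # no common arc
        li, lj = calls[i][2], calls[j][2]
        # one of the two calls must end before the other starts
        if not (start[i] >= start[j] + lj or start[j] >= start[i] + li):
            return False
    return True

def makespan(calls, start):
    """T = max_h (S(C_h) + l_h)."""
    return max(t + l for (_, _, l), t in zip(calls, start))
```

With two calls sharing arc (1, 2), the same color forces them to be sequential, while two colors let them run in parallel.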

In the Minimum Call Scheduling (MIN-CS) problem it is required to find
a routing and a schedule for an instance $((G, k), \mathcal{C})$ having minimum makespan.
Since the problem is trivial if $k \ge |\mathcal{C}|$, we assume $k < |\mathcal{C}|$. Notice that, if routes
are fixed, the problem of scheduling calls in order to minimize the makespan is
unapproximable within $O(|\mathcal{C}|^{1-\delta})$, for any $\delta > 0$ [5], in general networks.

Given an instance of MIN-CS and a routing $R$, the load $L_R(e)$ of arc $e$ is defined
as the sum of the durations of calls using it. If $k = 1$, then $L^* = \min_R \max_e L_R(e)$
is a natural lower bound to the makespan of any schedule.
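As an illustration of the load, the sketch below computes the load of every left-to-right arc of a chain, where the routing is forced. The representation of a call as (s, d, l) with s < d and the name `chain_loads` are assumptions made for this example only.

```python
def chain_loads(calls, m):
    """Load of each arc (v, v+1), v = 0..m-2, of a chain with m vertices.

    calls: list of (s, d, l) with s < d; such a call uses arcs
    (s, s+1), ..., (d-1, d), which is its only possible route in a chain.
    """
    load = [0] * (m - 1)
    for s, d, l in calls:
        for v in range(s, d):
            load[v] += l
    return load
```

For one color, the maximum entry of the returned list is a lower bound on the makespan, since all calls crossing the most loaded arc must be executed one after the other.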

In the following we show that MIN-CS in chain networks is closely related
to Minimum Dynamic Storage Allocation (MIN-DSA), which requires the
allocation of contiguous areas in a linear storage device in order to minimize
the overall requested storage space. A block $B = (f, e, z)$ represents a storage
requirement such that $f$ and $e$ are, respectively, the first and last time instants
in which block $B$ must be allocated and $z$ is an integer denoting the memory
requirement (size) of $B$. An instance of MIN-DSA consists of a set of blocks
$\mathcal{B} = \{B_1 = (f_1, e_1, z_1), \ldots, B_n = (f_n, e_n, z_n)\}$; an allocation of $\mathcal{B}$ is a function
assigning blocks to storage locations, such that if both $B_i$ and $B_j$ must stay at the
same time in the storage device then they must be assigned to non overlapping
portions. Formally, an allocation is a function $F : \mathcal{B} \to N$ such that, for any
pair $B_i, B_j$ of blocks, if $f_i \le f_j \le e_i$ or $f_j \le f_i \le e_j$ then either $F(B_i) + z_i - 1 <
F(B_j)$ or $F(B_j) + z_j - 1 < F(B_i)$. MIN-DSA is defined as follows: given an infinite






array of storage cells and a set $\mathcal{B}$ of blocks, find an allocation $F$ such that the
storage size $M = \max_{1 \le h \le n} \{F(B_h) + z_h\}$ is minimum. MIN-DSA is NP-hard [9]
but approximable within a constant [10,11].
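The allocation constraint can be verified with a few lines of code. This is a minimal sketch, assuming blocks are triples (f, e, z) with f <= e and that addresses are given as a list aligned with the blocks; the helper names are hypothetical.

```python
from itertools import combinations

def overlap_in_time(b1, b2):
    """True if the two blocks must be in storage at a common instant.

    Equivalent to "f_i <= f_j <= e_i or f_j <= f_i <= e_j" when f <= e.
    """
    return b1[0] <= b2[1] and b2[0] <= b1[1]

def is_valid_allocation(blocks, addr):
    """blocks[i] = (f, e, z); addr[i] = first storage cell of block i."""
    for i, j in combinations(range(len(blocks)), 2):
        if overlap_in_time(blocks[i], blocks[j]):
            zi, zj = blocks[i][2], blocks[j][2]
            if not (addr[i] + zi - 1 < addr[j] or addr[j] + zj - 1 < addr[i]):
                return False
    return True

def storage_size(blocks, addr):
    """M = max_h (F(B_h) + z_h)."""
    return max(a + z for (_, _, z), a in zip(blocks, addr))
```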



3 Call Scheduling in Chain Networks 

In this section we assume that the network is a chain with $m$ vertices, simply
denoted as $0, 1, \ldots, m-1$. The restriction of MIN-CS to chain networks will be
denoted as CHAIN-MIN-CS (or CHAIN-MIN-CS$_k$, when $k$ is the number of available
colors). Since the network is a chain, there is only one path between each
source-destination pair. This implies that routing of calls is fixed and that the set $\mathcal{C}$ of
calls consists of two independent subsets $\mathcal{C}_1 = \{[s_{11}, d_{11}, l_{11}], \ldots, [s_{1n_1}, d_{1n_1}, l_{1n_1}]\}$
and $\mathcal{C}_2 = \{[s_{21}, d_{21}, l_{21}], \ldots, [s_{2n_2}, d_{2n_2}, l_{2n_2}]\}$, where $n_1 + n_2 = n$ and, for any
$h = 1, \ldots, n_1$, $s_{1h} < d_{1h}$ while, for any $h = 1, \ldots, n_2$, $s_{2h} > d_{2h}$. This implicitly
defines two independent subinstances for each instance of the problem. Without
loss of generality, in the following we always refer to just the first of them. The
main result of this section is a 5-approximation algorithm for CHAIN-MIN-CS. This result
exploits a close relationship between CHAIN-MIN-CS$_1$ and MIN-DSA, which is stated in
the next theorem:

Theorem 1. There exists a polynomial-time reduction from CHAIN-MIN-CS$_1$ to
MIN-DSA such that an instance of the first problem admits a schedule with
makespan T if and only if the corresponding instance of the second problem
admits a storage allocation with size T.
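The proof of Theorem 1 is omitted here. One natural correspondence, which we state only as an assumption about how such a reduction could look, maps the arcs of the chain to the time axis of the storage problem and starting times to storage addresses: a call [s, d, l] with s < d becomes the block (s, d-1, l). The sketch below checks the key property that two calls share an arc exactly when their blocks overlap in time; all names are ours.

```python
def call_to_block(call):
    """Map a left-to-right call (s, d, l) to a block (f, e, z).

    Arc (v, v+1) plays the role of time instant v, so the call occupies
    "instants" s..d-1, and its duration l becomes the block size.
    """
    s, d, l = call
    return (s, d - 1, l)

def share_arc(c1, c2):
    """True if two left-to-right calls use a common arc of the chain."""
    return max(c1[0], c2[0]) < min(c1[1], c2[1])

def blocks_overlap(b1, b2):
    """True if two blocks are alive at a common instant."""
    return b1[0] <= b2[1] and b2[0] <= b1[1]
```

Under this mapping a starting time S(C) can be read as the address F(B) of the corresponding block, so that the makespan max(S + l) coincides with the storage size max(F + z).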

The reverse reduction also holds, but we do not need it here. Our first
approximation result is a direct consequence of Theorem 1 and the 3-approximation
algorithm for MIN-DSA given in [11].

Corollary 1. There exists a 3-approximation algorithm for CHAIN-MIN-CS$_1$.

We now turn to the general case. The algorithm we propose is sketched below: 
the first step rounds up call durations to the closest power of 2; this worsens the 
approximation of the makespan of the optimum schedule by a factor at most 2. 

Algorithm chain-cs

input: chain graph G, set C of calls, integer k;

begin

1. for h := 1 to n do round $l_h$ up to the closest power of 2;

2. Find a pseudo-schedule assuming only 1 available color;

3. Assign calls to colors;

4. Find a schedule separately for each color;

end.
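Step 1 can be sketched as follows, assuming integer durations of at least 1; the function name `round_up_pow2` is ours. Since the rounded value is always smaller than twice the original one, the step costs at most a factor 2 in the makespan.

```python
def round_up_pow2(l):
    """Smallest power of 2 that is >= l, for an integer l >= 1."""
    return 1 << (l - 1).bit_length()

def round_durations(calls):
    """Apply step 1 of chain-cs to a list of calls (s, d, l)."""
    return [(s, d, round_up_pow2(l)) for s, d, l in calls]
```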

We assume that call durations are powers of 2. Assuming only one color, in
the sequel we will use the following interpretation: a call $C = [s, d, l]$ scheduled
in interval $[t, t + l]$ can be graphically represented as a rectangle of height $l$,







Fig. 1. Example of pseudo-schedule for 1 color 



whose x-coordinates vary in $[s, d]$ and whose y-coordinates vary in $[t, t + l]$. We
now describe steps 2-4 in more detail.

Find a pseudo-schedule. 

We first determine a pseudo-schedule for $\mathcal{C}$, assuming only 1 color is available.
A pseudo-schedule is an assignment $PS : \mathcal{C} \to \{0, \ldots, \sum_{h=1}^{n} l_h\}$ of starting times
to calls, such that each arc may be used by more than one call at the same time.
The length of PS with respect to the set $\mathcal{C}$ of calls is $\max_{h \in \{1, \ldots, n\}} \{PS(C_h) + l_h\}$.

For our purposes, we need a pseudo-schedule in which at most two calls may
use the same arc at the same time, based on the following definition: assume
that, for some $h \le n - 1$, we have a pseudo-schedule for calls $C_1, \ldots, C_h$;
we say that $C_{h+1}$ is stable [10] at time $t$ with respect to the pseudo-schedule
of $C_1, \ldots, C_h$ if and only if there exists an arc $e$ such that $C_{h+1}$ uses $e$
and, for every instant $0, \ldots, t-1$, $e$ is used by at least one call in $\{C_1, \ldots, C_h\}$. In
order to obtain such a pseudo-schedule, we proceed inductively as follows: i) calls
are first ordered according to non-increasing durations (i.e. if $j > i$ then $l_i \ge l_j$);
ii) $C_1$ is assigned starting time 0; iii) assuming $C_1, \ldots, C_h$ have each been
assigned a starting time, $C_{h+1}$ is assigned the maximum starting time such that
it is stable with respect to the pseudo-schedule of $C_1, \ldots, C_h$.

Under the assumption that call durations are powers of 2 it is possible to
show [10] that this choice leads to a pseudo-schedule in which at most two calls
use the same arc at the same time. An example of a pseudo-schedule is presented
in Figure 1 where, as described above, each call $C_h = [s_h, d_h, l_h]$ corresponds to
a rectangle of height $l_h$ whose horizontal coordinates vary between $s_h$ and $d_h$.
In the following we use PS to denote the particular pseudo-schedule
constructed above. Note that the notion of stability implies that the length of PS
is a lower bound to the makespan of the optimal schedule for one color.
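The inductive construction can be sketched in code. We read the rule as follows: the maximum time at which a call is stable is the largest, over the arcs it uses, of the length of the initial run of instants already covered on that arc. The data layout (calls as (s, d, l) with s < d, one set of busy instants per arc) and the name `pseudo_schedule` are assumptions of this sketch, not the authors' implementation.

```python
def pseudo_schedule(calls, m):
    """Starting times PS(C_h), in input order, on a chain with m vertices.

    calls: list of (s, d, l) with s < d; durations are assumed to have
    been rounded to powers of 2 already.
    """
    # i) process calls by non-increasing duration (ties keep input order)
    order = sorted(range(len(calls)), key=lambda h: -calls[h][2])
    busy = [set() for _ in range(m - 1)]   # busy[v]: instants used on arc (v, v+1)
    start = [0] * len(calls)

    def covered_prefix(v):
        """Largest t such that instants 0..t-1 are all used on arc v."""
        t = 0
        while t in busy[v]:
            t += 1
        return t

    for h in order:
        s, d, l = calls[h]
        # ii)/iii) latest time at which the call is still stable
        t = max(covered_prefix(v) for v in range(s, d))
        start[h] = t
        for v in range(s, d):
            busy[v].update(range(t, t + l))
    return start
```

On the instance [(0, 1, 4), (2, 3, 4), (1, 3, 4), (0, 2, 2)] with m = 4 the last two calls both start at time 4 and overlap on arc (1, 2), which illustrates why the result is only a pseudo-schedule.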

Assign calls to colors. 

We partition the set of calls in the pseudo-schedule into $\gamma$ subsets $F_1, F_2, \ldots, F_\gamma$
called stripes, by “slicing” the pseudo-schedule into horizontal stripes (Figure 1).

Stripes are defined inductively as follows: i) let $C_{h_1}$ be the longest call starting
at time 0 in the pseudo-schedule. Then stripe $F_1$ has height $\Delta_1 = l_{h_1}$ and includes
all calls that in PS start not after $\Delta_1$; ii) assume that stripes $F_1, \ldots, F_{i-1}$ have






been defined ($i \ge 2$) and that there are still calls not assigned to a stripe; let $t$
be the time at which the last call in $F_{i-1}$ ends and let $C_{h_i}$ be the longest call
starting at $t$ in the pseudo-schedule PS. Then stripe $F_i$ has height $\Delta_i = l_{h_i}$
and includes all calls that start not before $t$ and end not after $t + l_{h_i}$. Figure 1
illustrates an example of slicing of a pseudo-schedule into stripes.

Let $F = \{F_1, F_2, \ldots, F_\gamma\}$ be defined by the construction given above. The
next lemma states that $F$ defines a partition of $\mathcal{C}$.

Lemma 1. Every call in $\mathcal{C}$ belongs to one and only one set in $F$.

Stripes (and the corresponding calls) are then assigned to colors by defining
a proper instance $x$ of Multiprocessor Scheduling with $k$ identical machines
with the goal of minimizing the total makespan. Roughly, jobs of instance $x$
correspond to the sets $\{F_1, F_2, \ldots, F_\gamma\}$ of the partition $F$ obtained above. Namely,
with each stripe of height $\Delta$ we associate a job of the same size.

We use the LPT (Longest Processing Time first) rule proposed by Graham [12]
to solve the obtained instance of Multiprocessor Scheduling: jobs
are first ordered according to decreasing sizes and then greedily assigned to
the processors (i.e. a job is assigned to the least heavily loaded processor). The
obtained solution yields an assignment of calls to colors, defined as follows: if
stripe $F_i$ is assigned to machine $j$, then all calls in $F_i$ are assigned color $j$.
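The LPT rule itself is standard and can be sketched with a priority queue. Here jobs stand for stripe heights and machines for colors; the function name `lpt` and the 0-based machine numbering are choices made for this illustration.

```python
import heapq

def lpt(sizes, k):
    """Longest Processing Time first on k identical machines.

    Returns (assignment, loads): assignment[i] is the machine of job i,
    loads[j] is the total size placed on machine j.
    """
    heap = [(0, j) for j in range(k)]            # (current load, machine)
    assignment = [None] * len(sizes)
    # consider jobs by decreasing size
    for i in sorted(range(len(sizes)), key=lambda i: -sizes[i]):
        load, j = heapq.heappop(heap)            # least loaded machine
        assignment[i] = j
        heapq.heappush(heap, (load + sizes[i], j))
    loads = [0] * k
    for i, j in enumerate(assignment):
        loads[j] += sizes[i]
    return assignment, loads
```

For stripe heights 8, 4, 4, 2 and two colors, the first color receives the stripes of height 8 and 2 and the second color the two stripes of height 4.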
Now, for each color $j$, we derive a pseudo-schedule $PS_j$ of the calls that have
been assigned color $j$ in the following way: assume machine $j$ has been assigned
stripes $F_{j_1}, \ldots, F_{j_r}$ (in this order) and let $C$ be a call assigned to color $j$ and
belonging to stripe $F_{j_p}$, $p \in \{1, \ldots, r\}$. Recall that $PS(C)$ is the starting time
of $C$ in the pseudo-schedule obtained in step 2 of chain-cs. The starting time
$PS_j(C)$ of $C$ in the pseudo-schedule for color $j$ is given by the sum of the heights
of stripes $F_{j_1}, \ldots, F_{j_{p-1}}$ plus the offset of $C$ in its stripe $F_{j_p}$. Namely, we have
that

$PS_j(C) = \sum_{i=1}^{p-1} \Delta_{j_i} + PS(C) - \sum_{i=1}^{j_p - 1} \Delta_i$.

Figure 2 illustrates the effect of Assign-Calls-To-Colors over the pseudo-schedule
of Figure 1 in the case of 2 colors. In particular, call $C \in F_3$ in Figure 1
has been assigned color 2 and $PS_2(C) = \Delta_2 + PS(C) - (\Delta_1 + \Delta_2)$.
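The shift of starting times can be written as a small function. This sketch assumes 0-based stripe indices and that stripe i begins, in the original pseudo-schedule, at the sum of the heights of the stripes preceding it; the names and the numeric heights in the usage example are hypothetical.

```python
def start_in_color(ps, stripe, heights, color_stripes):
    """Starting time of a call inside the pseudo-schedule of its color.

    ps            starting time PS(C) in the one-color pseudo-schedule
    stripe        index of the stripe containing C
    heights       heights[i] is the height of stripe i
    color_stripes ordered list of the stripes assigned to the color of C
    """
    p = color_stripes.index(stripe)
    before_on_color = sum(heights[i] for i in color_stripes[:p])
    stripe_begin = sum(heights[:stripe])     # where the stripe starts in PS
    return before_on_color + (ps - stripe_begin)
```

For instance, with heights 8, 4, 4 and the second color holding the second and third stripes, a call of the third stripe with PS(C) = 13 starts at 4 + 13 - (8 + 4) = 5, matching the example in the text.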

We now analyze the makespan of the pseudo-schedule. The assumption that
the duration of each call is a power of 2 and the characteristics of the
pseudo-schedule obtained in the first step of the algorithm allow a tight analysis of
LPT, yielding the next lemma. For any color $j = 1, \ldots, k$, let $T^j$ be the length
of the associated pseudo-schedule after procedure Assign-Calls-To-Colors and
let $T_{PS} = \max_j T^j$ (see Fig. 2). Finally, let $T^*$ denote the makespan of an optimal
schedule for the instance under consideration.

Lemma 2. $T_{PS} \le T^*$.

Find a Schedule. 

For every color $j = 1, \ldots, k$, let $PS_j$ denote the pseudo-schedule obtained after
procedure Assign-Calls-To-Colors has been run and let $\mathcal{C}_j$ denote the set of calls
that are assigned color $j$. It remains to derive a proper schedule $S_j$ from $PS_j$, for
any color $j = 1, \ldots, k$. The relationship between MIN-DSA and CHAIN-MIN-CS$_1$
stated in Theorem 1 implies the following lemma.






Fig. 2. Example of pseudo-schedule for 2 colors (Tps = = T'^ in this case) 



Lemma 3. [10] If call durations are powers of 2 then there exists an algorithm
for scheduling calls in $PS_j$ with makespan at most 5/2 times the length of $PS_j$.

Lemmas 1, 2 and 3 imply the following theorem.

Theorem 2. Algorithm chain-cs finds a schedule whose makespan is at
most 5 times the optimum.

Notice that the same result above holds also in the case of undirected chains. 

4 Call Scheduling in Ring Networks 

In the sequel, $\Sigma_m$ denotes a directed ring with $m$ vertices, clockwise indexed
$0, \ldots, m-1$. The arc set is $\{(v, (v+1) \bmod m), (v, (v-1) \bmod m) : 0 \le v \le m-1\}$,
where $(v, (v+1) \bmod m)$ (respectively $(v, (v-1) \bmod m)$) denotes the clockwise
(counter clockwise) directed arc between vertices $v$ and $(v+1) \bmod m$ ($v$ and
$(v-1) \bmod m$). In the following we perform operations modulo $m$ and we write
$v+1$ ($v-1$) for $(v+1) \bmod m$ (respectively $(v-1) \bmod m$). We define $[s, t]$ to
be $\{u \mid s \le u \le t\}$ if $s \le t$, $[s, m-1] \cup [0, t]$ otherwise.

A call in a ring can be routed clockwise or counter clockwise. We first find a 
routing of the calls minimizing the load and then we schedule them by using a 
constant approximation algorithm based on the results of Section 3. 

4.1 Routing with Minimum Load 

Given a directed ring $\Sigma_m$ and a set $\mathcal{C}$ of calls, Minimum Weighted Directed
Ring Routing (MIN-WDRR) is the problem of routing calls in $\mathcal{C}$ so that the
maximum load on the arcs is minimized. $A_v$ ($B_v$), $v = 0, 1, \ldots, m-1$, denotes
the load on arc $(v, v+1)$ (arc $(v+1, v)$). With each call $C_h \in \mathcal{C}$ we can associate a
binary variable $x_h$ such that $x_h = 1$ if $C_h$ is routed clockwise, $x_h = 0$ otherwise.
Under these assumptions, MIN-WDRR can be formally described as follows:

$\min L = \max\{\max_v A_v, \max_v B_v\}$

$A_v = \sum_{h : [v, v+1] \subseteq [s_h, t_h]} l_h x_h$, $B_v = \sum_{h : [v+1, v] \subseteq [t_h, s_h]} l_h (1 - x_h)$, $x_h \in \{0, 1\}$, $1 \le h \le n$.
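To make the objective concrete, the following sketch computes the loads of all arcs for a given integral routing. It assumes calls are triples (s, t, l) with s different from t, that A[v] refers to the clockwise arc (v, v+1) and B[v] to the counter clockwise arc (v+1, v), indices taken modulo m; the name `ring_loads` is ours.

```python
def ring_loads(calls, x, m):
    """Loads (A, B) on a directed ring with m vertices.

    calls[h] = (s, t, l); x[h] = 1 routes call h clockwise, 0 counter clockwise.
    """
    A = [0] * m
    B = [0] * m
    for (s, t, l), xh in zip(calls, x):
        v = s
        if xh == 1:
            while v != t:                 # traverse arcs (v, v+1) from s to t
                A[v] += l
                v = (v + 1) % m
        else:
            while v != t:                 # traverse arcs (v+1, v) from s down to t
                v = (v - 1) % m
                B[v] += l
    return A, B

def max_load(calls, x, m):
    """The objective L = max(max_v A_v, max_v B_v)."""
    A, B = ring_loads(calls, x, m)
    return max(max(A), max(B))
```

On a ring with 4 vertices and calls (0, 2, 3) and (1, 3, 2), routing both clockwise gives load 5 on arc (1, 2), while routing the second call counter clockwise gives maximum load 3.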



OPT denotes the value of an optimal solution for any given instance of
MIN-WDRR. MIN-WDRR is polynomial-time solvable when $l_h = 1$, $h = 1, 2, \ldots, n$ [19].

Theorem 3. MIN-WDRR is NP-hard.



Lemma 4. There is a polynomial-time algorithm that solves Weighted Directed
Ring Routing with load at most twice the optimum.

Sketch of the proof. We solve an instance of Multicommodity Flow [1]
obtained by relaxing the integrality constraints in the formulation of MIN-WDRR
above. Let $\{x_1^*, \ldots, x_n^*\}$ be the fractional optimal solution: the $h$th component
is rounded to 0 if its value is $< 1/2$, to 1 otherwise. □
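The rounding step of the proof sketch is easy to state in code. The factor 2 comes from the fact that a variable rounded up to 1 was at least 1/2, and a variable rounded down to 0 had complement larger than 1/2, so the contribution of each call to any arc at most doubles. The function name is ours, and the fractional values are assumed to be given (solving the relaxation is not shown).

```python
from fractions import Fraction

def round_at_half(x_frac):
    """Round each fractional x_h to 0 if x_h < 1/2, to 1 otherwise."""
    return [0 if x < Fraction(1, 2) else 1 for x in x_frac]
```

For every component the rounded value r satisfies r <= 2x and 1 - r <= 2(1 - x), which is the inequality used to bound both clockwise and counter clockwise loads.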

We now use Lemma 4 to obtain a polynomial-time approximation scheme
(PTAS) for MIN-WDRR. Let $L$ be the load obtained by applying the algorithm
described above. In the following, $\epsilon$ denotes any fixed positive number.
Following [14], a call is routed long-way if it uses the longer path to connect its end
vertices (ties are broken arbitrarily), it is routed short-way otherwise;
furthermore, a call $C_h$ is heavy if $l_h > \epsilon L/3$, it is light otherwise. The number of heavy
calls routed long-way is bounded, as stated in the following lemma.

Lemma 5. In any optimal solution there are at most $12/\epsilon$ heavy, long-way
routed calls.

Let $H \subseteq \mathcal{C}$ denote the set of heavy calls. For any set $S \subseteq H$, let $LP_S$ denote
the following linear program:

$\min L(S) = \max\{\max_v A_v, \max_v B_v\}$

$A_v = \sum_{h : [v, v+1] \subseteq [s_h, t_h]} l_h x_h + a_v$, $B_v = \sum_{h : [v+1, v] \subseteq [t_h, s_h]} l_h (1 - x_h) + b_v$, $x_h \in [0, 1]$, $1 \le h \le n$.

$a_v$ ($b_v$) denotes the load on clockwise (counter clockwise) directed arc $(v, v+1)$
($(v+1, v)$) resulting from routing long-way calls belonging to $S$ and short-way
calls belonging to $H - S$. Finally, $L^*(S)$ denotes the value of the optimal
(fractional) solution of $LP_S$. We now give the PTAS for MIN-WDRR:






Algorithm WDRR-PAS($\epsilon$)
input:

output: routing for $(\Sigma_m, \mathcal{C})$;
begin (1)

for each $S \subseteq H$ : $|S| \le 12/\epsilon$ do
begin (2)

for each call $C \in H$ do

if $C \in S$ then route C long-way
else route C short-way;
solve $LP_S$
end(2);

Let y be the solution corresponding to $S'$ such that $L^*(S') = \min_S L^*(S)$;
Apply a rounding procedure to y to obtain a feasible routing x with load $L(S')$;
Output x
end(1).
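The outer loop of the algorithm enumerates the candidate sets of heavy calls to be routed long-way. The sketch below covers only this enumeration, under the assumption that heavy means duration larger than epsilon L / 3; solving each linear program and the final rounding are not shown, and the function names are hypothetical.

```python
from itertools import combinations
from math import floor

def heavy_calls(calls, L, eps):
    """Indices of calls (s, t, l) whose duration exceeds eps * L / 3."""
    return [h for h, (_, _, l) in enumerate(calls) if l > eps * L / 3]

def candidate_long_way_sets(heavy, eps):
    """Yield every subset S of the heavy calls with |S| <= 12 / eps."""
    bound = min(len(heavy), floor(12 / eps))
    for size in range(bound + 1):
        for S in combinations(heavy, size):
            yield set(S)
```

For a fixed value of epsilon the subsets have bounded size, so their number is polynomial in the number of calls, which is what keeps the whole scheme polynomial.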



Lemma 6 below shows that, for every $S \subseteq H$, $|S| \le 12/\epsilon$, we can round the
solution of $LP_S$ in such a way that the value $L(S)$ of the integer solution obtained
differs at most $\epsilon L/2$ from $L^*(S)$. In order to prove the lemma, following [19] we
first give the following definition: two calls $C_h = [s_h, t_h, l_h]$ and $C_j = [s_j, t_j, l_j]$
are parallel if the intervals $[s_h, t_h]$ and $[t_j, s_j]$ or the intervals $[t_h, s_h]$ and $[s_j, t_j]$
intersect at most at their endpoints. It may happen that $s_h = s_j$
and $t_h = t_j$. Viewing an arc also as a chord, a demand is parallel to an arc when
it can be routed through that arc, otherwise it is parallel to that arc’s reverse.
Finally, for any optimal solution $x^* = \{x_1^*, \ldots, x_n^*\}$ to $LP_S$, call $C_h$ is said to be
split with respect to $x^*$ if $0 < x_h^* < 1$.

Lemma 6. For any $S \subseteq H$, $|S| \le 12/\epsilon$, there is a polynomial time rounding
procedure of the solution of $LP_S$, such that $L(S) - L^*(S) \le \epsilon L/2$.

Proof. Given $S \subseteq H$, $|S| \le 12/\epsilon$, let $x^*$ be a solution to $LP_S$ with value $L^*(S)$.
Following [19] we first obtain a new fractional solution $\hat{x} = \{\hat{x}_1, \ldots, \hat{x}_n\}$ such
that no pair of parallel calls are both split and whose value $\hat{L}(S)$ is not larger
than $L^*(S)$. Let us assume that there is a pair $C_h = [s_h, t_h, l_h]$ and $C_j = [s_j, t_j, l_j]$
of parallel calls, with $0 < x_h^*, x_j^* < 1$. We now reroute them in such a way that
only one of them remains split. Two cases may arise:

1. $x_h^* + l_j x_j^*/l_h \le 1$. In this case we set $\hat{x}_h = x_h^* + l_j x_j^*/l_h$ and $\hat{x}_j = 0$. Consider now
a clockwise directed arc $(v, v+1)$ belonging to the interval $[s_h, t_h] \cap [s_j, t_j]$ and
let $A_v^*$, $\hat{A}_v$ be its loads respectively before and after rerouting. Considering only
the contribution of $C_h$ and $C_j$ (the other calls are unaffected), it is easily verified
that $\hat{A}_v = l_h x_h^* + l_j x_j^* = A_v^*$. If instead $(v+1, v)$ is any counter clockwise directed
arc in $[t_h, s_h] \cap [t_j, s_j]$, it is easily seen that, if $B_v^*$ and $\hat{B}_v$ respectively denote
its loads before and after rerouting, we have $\hat{B}_v = (1 - x_h^*) l_h + (1 - x_j^*) l_j = B_v^*$.
If $s_h \ne s_j$ or $t_h \ne t_j$, it also holds that arcs not belonging to $[s_h, t_h] \cap [s_j, t_j]$ or
$[t_h, s_h] \cap [t_j, s_j]$ have the same or reduced loads.

2. $x_h^* + l_j x_j^*/l_h > 1$. In this case we set $\hat{x}_h = 1$ and $\hat{x}_j = x_j^* + (l_h/l_j)(x_h^* - 1)$.
The analysis of this case proceeds exactly as in the previous one. Again we have
$\hat{A}_v = A_v^*$ for any clockwise directed arc belonging to interval $[s_h, t_h] \cap [s_j, t_j]$
and $\hat{B}_v = B_v^*$ for any counter clockwise directed arc belonging to interval $[t_h, s_h] \cap [t_j, s_j]$,
while the load on any other arc, if any, stays the same or decreases. As before,
a similar argument holds for counter clockwise directed arcs.
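The two rerouting cases can be checked numerically. The sketch below applies the update to a pair of parallel split calls and relies on exact rational arithmetic; it assumes integer durations and fractional values given as `Fraction` objects, and the function name `merge_parallel` is ours.

```python
from fractions import Fraction

def merge_parallel(xh, lh, xj, lj):
    """Reroute two parallel split calls so that at most one stays split.

    xh, xj: fractional clockwise shares (Fractions strictly between 0 and 1)
    lh, lj: integer durations
    Returns the new pair (xh', xj'); lh*xh' + lj*xj' equals lh*xh + lj*xj.
    """
    total = xh + Fraction(lj, lh) * xj
    if total <= 1:
        # case 1: all the clockwise weight of C_j is moved onto C_h
        return total, Fraction(0)
    # case 2: C_h becomes fully clockwise, C_j keeps the remainder
    return Fraction(1), xj + Fraction(lh, lj) * (xh - 1)
```

In both cases the clockwise weight carried by the pair is preserved, which is why the load on the arcs common to the two calls does not change.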

At the end of this procedure we have a new solution $\hat{x}$ whose load $\hat{L}(S)$ is
at most $L^*(S)$ and such that no two split demands are parallel. Without loss
of generality we may assume that calls in $\mathcal{C}$ are ordered in such a way that
$\{C_1, \ldots, C_q\}$ is the set of split calls for solution $\hat{x}$. We are now left to route split
calls. Given any clockwise (counter clockwise) directed arc $(v, v+1)$ ($(v+1, v)$),
let $F_v \subseteq \{C_1, \ldots, C_q\}$ be the set of all split calls that are parallel to $(v, v+1)$
($(v+1, v)$). Observe that calls in $F_v$ represent an interval in the ordered set
$\{C_1, \ldots, C_q\}$; in the following, we shall denote $F_v$ as an interval $[i_v, j_v]$, where $i_v$
and $j_v$ respectively are the first and last index of calls in $\{C_1, \ldots, C_q\}$ that
are parallel to $(v, v+1)$ (or $(v+1, v)$). Also observe that $F_v$ is exactly the set
of all and only the calls whose rerouting can potentially increase the load on
$(v, v+1)$ (or $(v+1, v)$). Let $\bar{A}_v$ (respectively $\bar{B}_v$) denote the load resulting
on $(v, v+1)$ (respectively $(v+1, v)$) after routing calls that are split in $\hat{x}$ and
let $\bar{x} = \{\bar{x}_1, \ldots, \bar{x}_n\}$ be the corresponding integer solution. Finally, let $L(S) =
\max\{\max_v \bar{A}_v, \max_v \bar{B}_v\}$. We have:

$\bar{A}_v = \hat{A}_v + \sum_{h \in [i_v, j_v]} l_h (\bar{x}_h - \hat{x}_h)$,

$\bar{B}_v = \hat{B}_v + \sum_{h \in [i_v, j_v]} l_h [(1 - \bar{x}_h) - (1 - \hat{x}_h)] = \hat{B}_v + \sum_{h \in [i_v, j_v]} l_h (\hat{x}_h - \bar{x}_h)$.



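We are now ready to route calls that are split in $\hat{x}$. For $j = 1, \ldots, q$, $C_j$ is routed
as follows:

$\bar{x}_j = 1$ if $-l_j \hat{x}_j + \sum_{h=1}^{j-1} l_h (\bar{x}_h - \hat{x}_h) < -l_j/2$, and $\bar{x}_j = 0$ otherwise.

As a consequence, if $\bar{x}_j = 1$ then $\sum_{h=1}^{j} l_h (\bar{x}_h - \hat{x}_h) < l_j/2$, while if $\bar{x}_j = 0$ then
$\sum_{h=1}^{j} l_h (\bar{x}_h - \hat{x}_h) \ge -l_j/2$. In both cases $\sum_{h=1}^{j} l_h (\bar{x}_h - \hat{x}_h) \in [-l_{max}/2, l_{max}/2]$ for any
$j \in [i_v, j_v]$.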
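The rounding rule keeps a running deviation between the integer and the fractional routing and pushes it back towards zero at every step. The sketch below follows our reading of the rule (round up exactly when rounding down would bring the deviation below minus half the duration); the threshold is a reconstruction, and the function names are ours.

```python
from fractions import Fraction

def round_split_calls(x_hat, lengths):
    """Round the split calls C_1..C_q in order.

    x_hat[j]   fractional clockwise share of call j (a Fraction)
    lengths[j] integer duration of call j
    Returns the 0/1 list x_bar.
    """
    x_bar = []
    dev = Fraction(0)          # sum over h < j of l_h * (x_bar_h - x_hat_h)
    for xj, lj in zip(x_hat, lengths):
        xb = 1 if dev - lj * xj < -Fraction(lj, 2) else 0
        dev += lj * (xb - xj)
        x_bar.append(xb)
    return x_bar

def prefix_deviations(x_hat, x_bar, lengths):
    """All prefix sums of l_h * (x_bar_h - x_hat_h)."""
    out, dev = [], Fraction(0)
    for xj, xb, lj in zip(x_hat, x_bar, lengths):
        dev += lj * (xb - xj)
        out.append(dev)
    return out
```

Every prefix deviation stays within half of the largest duration in absolute value, which is the property used in the bound on the load increase of each arc.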



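We now show that, for each clockwise directed arc $(v, v+1)$, $\bar{A}_v - \hat{A}_v \le
(3/2) l_{max}$, where $l_{max} = \max_{1 \le h \le q} l_h$. Since we are considering light calls,
we have $l_{max} \le (\epsilon L)/3$. This implies $\bar{A}_v - \hat{A}_v \le (\epsilon L)/2$. In particular, given any
clockwise directed arc $(v, v+1)$, two cases may arise:

1. $i_v \le j_v$. In this case we bound $\bar{A}_v - \hat{A}_v = \sum_{h=i_v}^{j_v} l_h (\bar{x}_h - \hat{x}_h)$ as follows:

$\bar{A}_v - \hat{A}_v = \sum_{h=1}^{j_v} l_h (\bar{x}_h - \hat{x}_h) - \sum_{h=1}^{i_v - 1} l_h (\bar{x}_h - \hat{x}_h) \le \frac{l_{max}}{2} - \left(-\frac{l_{max}}{2}\right) = l_{max}$.

2. $i_v > j_v$. In this case we have:

$\bar{A}_v - \hat{A}_v = \sum_{h=1}^{q} l_h (\bar{x}_h - \hat{x}_h) - \sum_{h=1}^{i_v - 1} l_h (\bar{x}_h - \hat{x}_h) + \sum_{h=1}^{j_v} l_h (\bar{x}_h - \hat{x}_h) \le \frac{3}{2} l_{max}$.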







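Since $l_{max} < \frac{3}{2} l_{max}$, it follows that $\bar{A}_v - \hat{A}_v \le (3/2) l_{max}$. Since we are only
considering light calls (heavy calls have already been routed) and recalling the
definition of light call, $\bar{A}_v - \hat{A}_v \le (\epsilon L)/2$. In the same way we can prove that
$\bar{B}_v - \hat{B}_v \le (\epsilon L)/2$. Since $L(S) - L^*(S) \le \max_v \{\max\{\bar{A}_v - \hat{A}_v, \bar{B}_v - \hat{B}_v\}\}$ the thesis follows.
□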

This is sufficient to prove the main result of this section. In fact, if $S_{opt}$
denotes the set of heavy calls that are routed long-way in the optimal integer
solution, we have $L(S) \le \epsilon L/2 + L^*(S_{opt})$ where, by Lemma 4, $\epsilon L \le 2 \epsilon L^*(S_{opt})$.
This implies that $L(S) \le (1 + \epsilon) L^*(S_{opt})$; since $L^*(S_{opt})$ is a lower bound to
the optimal integer solution the following theorem holds.

Theorem 4. The solution provided by algorithm WDRR-PAS($\epsilon$) has value at
most $(1 + \epsilon)$OPT. For any fixed $\epsilon > 0$, the time complexity of the algorithm is
polynomial.

4.2 An Approximate Algorithm for ring-MIN-CS 

For any $\epsilon > 0$, Algorithm WDRR-PAS($\epsilon$) finds a $(1 + \epsilon)$-approximate solution
for routing with minimum load. We now turn to the scheduling phase. In the
following $T^*$ denotes the makespan of an optimal solution to MIN-CS.

A simple idea is as follows: i) solve MIN-WDRR within $(1 + \epsilon/2)$ the optimum,
ii) cut the ring at any link, say $e$, iii) solve two instances of CHAIN-MIN-CS with
calls that do not use link $e$ (in any direction), iv) when the last call scheduled
in the previous step is completed, schedule calls that use link $e$ using a $(1 + \epsilon/2)$-approximate
partitioning algorithm.

It seems reasonable to expect that the above simple algorithm gives an
approximation ratio of $6 + \epsilon$. However this is not necessarily true because the optimum
scheduling algorithm might use a completely different routing of calls (even one
that is not $\epsilon$-close to the routing with minimum load). It is possible to prove the
following weaker result, whose proof is omitted here. In the full paper we will
give a counterexample showing that, using the routing obtained in the previous
subsection, one is unlikely to prove anything better than $11 + \epsilon$.

Theorem 5. For any fixed $\epsilon > 0$, there exists a polynomial-time algorithm that
finds a routing and a schedule for RING-MIN-CS having makespan $\le (12 + \epsilon) T^*$.

We conclude this section by observing that the same result holds also in the 
case of undirected rings. In this case we will use the polynomial time approxi- 
mation scheme proposed in [14] to compute the routing of calls. Details will be 
given in the full paper. 

5 Conclusions 

In this paper we have considered the Call Scheduling problem in all-optical
networks. We have proposed approximation algorithms with constant
approximation ratio for both chain and ring networks. As a side result, we have also





proposed a polynomial-time approximation scheme for the problem of routing 
weighted calls in a ring network with the aim of minimizing the maximum load, 
a problem which has itself a practical relevance in SONET networks [18,14,19]. 

It would be interesting to consider different network topologies and to
consider the problem with release dates and different objective functions. Also the
on-line version of the problems deserves future investigation.



References 

1. R.K. Ahuja, T.L. Magnanti, and J.B. Orlin. Network flows. Prentice-Hall, 1993.

2. B. Awerbuch, Y. Azar, A. Fiat, S. Leonardi, and A. Rosen. On-line competitive
algorithms for call admission in optical networks. In Proc. of the Annual
European Symposium on Algorithms, 1996.

3. Y. Bartal, A. Fiat, and S. Leonardi. Lower bounds for on-line graph problems
with application to on-line circuit and optical-routing. In Proc. of the 28th Annual
Symposium on the Theory of Computing, 1996.

4. Y. Bartal and S. Leonardi. On-line routing in all-optical networks. In Proc. of the
24th International Colloquium on Automata, Languages and Programming, 1997.

5. L. Becchetti. Efficient Resource Management in High Bandwidth Networks.
PhD thesis, Dipartimento di Informatica e Sistemistica, University of Rome “La
Sapienza”, 1998.

6. C. Brackett. Dense Wavelength Division Multiplexing Networks: Principles and
Applications. IEEE Journal Selected Areas in Comm., 8, 1990.

7. T. Erlebach and K. Jansen. Scheduling Virtual Connections in Fast Networks. In
Proc. of the 4th Parallel Systems and Algorithms Workshop PASA ’96, 1996.

8. T. Erlebach and K. Jansen. Call Scheduling in Trees, Rings and Meshes. In Proc.
of the 30th Hawaii International Conference on System Sciences, 1997.

9. M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the
Theory of NP-Completeness. W. H. Freeman, San Francisco, 1979.

10. J. Gergov. Approximation algorithms for dynamic storage allocation. In Proc. of
the 4th Annual European Symposium on Algorithms, 1996.

11. J. Gergov. Algorithms for Compile-Time Memory Optimization. Private
communication, 1998.

12. R. L. Graham. Bounds for multiprocessing timing anomalies. SIAM Journal of
Applied Math., 17, 1969.

13. P. E. Green. Fiber-optic Communication Networks. Prentice-Hall, 1992.

14. S. Khanna. A Polynomial Time Approximation Scheme for the SONET Ring
Loading Problem. Bell Labs Tech. J., Spring, 1997.

15. H. A. Kierstead and W. T. Trotter. An extremal problem in recursive
combinatorics. Congressus Numerantium, 33, 1981.

16. T. Leighton, B. Maggs, and S. Rao. Packet routing and jobshop scheduling in
O(congestion+dilation) steps. Combinatorica, 14, 1994.

17. P. Raghavan and E. Upfal. Efficient Routing in All-Optical Networks. In Proc. of
the 26th Annual Symposium on the Theory of Computing, 1994.

18. A. Schrijver, P. Seymour, and P. Winkler. The Ring Loading Problem. SIAM J.
on Discrete Math., 11, 1998.

19. G. Wilfong and P. Winkler. Ring Routing and Wavelength Translation. In Proc.
of the European Symposium on Algorithms, pages 333-341, 1998.



A Randomized Algorithm for Flow Shop 
Scheduling 



Naveen Garg$^1$, Sachin Jain$^2$, and Chaitanya Swamy$^3$

$^1$ Department of Computer Science, Indian Institute of Technology,
Hauz Khas, New Delhi - 110016, India
naveen@cse.iitd.ernet.in

$^2$ Department of Computer Sciences, The University of Texas at Austin,
Austin, TX 78712-1188, USA
sachin@cs.utexas.edu

$^3$ Department of Computer Science, Cornell University,
Ithaca, NY 14853, USA
swamy@cs.cornell.edu



Abstract. Shop scheduling problems are known to be notoriously
intractable, both in theory and practice. In this paper we give a randomized
approximation algorithm for flow shop scheduling where the number of
machines is part of the input problem. Our algorithm has a multiplicative
factor of $2(1 + \delta)$ and an additive term of $O(m \ln(m + n) p_{max} / \delta^2)$.



1 Introduction 

Shop scheduling has been studied extensively in many varieties. The basic shop
scheduling model consists of machines and jobs each of which consists of a set
of operations. Each operation has an associated machine on which it has to be
processed for a given length of time. The processing times of operations of a
job cannot overlap. Each machine can process at most one operation at a given
time. We assume that there are $m$ machines and $n$ jobs. The processing time (of
an operation) of job $j$ on machine $i$ is denoted by $p_{ij}$ and $p_{max} = \max p_{ij}$. We
will use the terms job (operation) size and processing time interchangeably.

The three well-studied models are the open shop, flow shop and job shop
problems. In an open shop problem, the operations of a job can be performed
in any order; in a job shop, they must be processed in a specific, job-dependent
order. A flow shop is a special case of job shop in which each job has exactly $m$
operations, one per machine, and the order in which they must be processed is
the same for all the jobs. The problem is to minimize the makespan, i.e. the overall
length, of the schedule with the above constraints.

All the above problems are strongly NP-Hard in their most general form. For 
job shops, extremely restricted versions are also strongly NP-Hard; for example 
when there are two machines and all operations have processing times of
one or two time units. For the flow shop problem, the case when there are more 
than two machines is strongly NP-Hard, although the two machine version is 
polynomially solvable [3]. The open shop problem is weakly NP-Hard when the 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 213—218, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






number of machines is fixed (but arbitrary), and whether it is strongly
NP-Hard is open. For two machines there exists a polynomial-time algorithm for
open shops.

As far as approximability of these models is concerned, Williamson et al. [10]
proved a lower bound of 5/4 for the problems in their most general form. For
the general open shop problem a greedy heuristic is a 2-approximation algorithm.
In the case of job shop, Shmoys, Stein and Wein [8] give a randomized
$O(\log^2(m\mu)/\log\log(m\mu))$ approximation algorithm, where $\mu$ is the maximum
number of operations per job. This bound was slightly improved by Goldberg,
Paterson, Srinivasan and Sweedyk [1]. Schmidt, Siegel and Srinivasan [4] give a
deterministic $O(\log^2(m\mu)/\log\log(m\mu))$ approximation algorithm. For flow shop, an
algorithm by Sevast'janov [5] gives an additive approximation of $m(m-1)p_{\max}$.
When $m$ is not fixed these are the best results known. For fixed $m$ there are $(1+\epsilon)$
polynomial approximation schemes for all three problems. An approximation
scheme for flow shop was given by Hall [2]. Recently an approximation scheme
for the open shop problem was given by Sevast'janov and Woeginger [6], while
Sviridenko, Jansen and Solis-Oba [9] give one for job shops.



Our Contribution 

In this paper we present a randomized approximation algorithm for flow shop
scheduling when the number of machines is not fixed. Our algorithm is based on
rounding the solution of an LP formulation of the flow shop problem. The
LP formulation imposes some additional constraints which make the rounding
scheme possible. The makespan returned by our algorithm is within $2(1+\delta)$ times
the optimal makespan and has an additive term of $O(m \ln(m+n)\, p_{\max}/\delta^2)$. This
shows a tradeoff between the additive and multiplicative factors; if $n$ is bounded
by some polynomial in $m$, the additive term we obtain is better than the one in
Sevast'janov's algorithm, and the multiplicative factor is better than that in
the algorithm by Shmoys et al.

The remaining part of this paper is organized as follows. In Section 2, we
discuss the new slotting constraints that we impose. Section 3 gives an integral
multicommodity flow formulation of the problem and Section 4 deals with the
randomized algorithm and its analysis.



2 Slotting Constraints 

It is no loss of generality to assume that all operations have size at least 1. Let
$p_{\max}$ be the largest operation size. The machines are numbered in the order in
which the operations of each job are to be processed.

We divide time into slots of size $s$, $s \ge 2p_{\max}$. For our randomized rounding
scheme to work we require that the slots be independent of each other. By this 
we mean that the order in which operations are scheduled on a machine in any 
time-slot is independent of the order of the operations in other time-slots and 
on other machines. To ensure this we impose the restriction that no operation 






straddles a slot boundary and that no job moves from one machine to another
in the middle of a slot. The second condition is equivalent to saying that the
operations of a job are performed in distinct slots. Thus a job's operation on
the $i$-th machine can only start if the operation on the $(i-1)$-th machine has been
completed by the end of the previous slot.

Consider a flow shop schedule with minimum makespan, $OPT$. We now
show how to modify the schedule to satisfy the slotting constraints. First divide
time into slots of size $s - p_{\max}$. Since there could be operations straddling
slot boundaries, we insert gaps of duration $p_{\max}$ after each slot. Operations starting
in a slot and going over to the next now finish in these gaps. Finally we merge
each gap with the slot just before it. This yields slots of size $s$, and each operation
now finishes in the slot in which it starts. Since $OPT$ was the makespan of the
original schedule, the makespan of this modified schedule is $s \cdot OPT/(s - p_{\max})$.
Next we shift all operations on the second machine by $s$, on the third machine by
$2s$, ..., on the $m$-th machine by $(m-1)s$. This increases the makespan by $(m-1)s$
and gives a schedule which satisfies the restriction that all operations of a job are
performed in distinct time-slots. Thus we have obtained a schedule which satisfies
the slotting constraints and has makespan at most $\frac{s}{s - p_{\max}} OPT + (m-1)s$.
Note that $\frac{s}{s - p_{\max}}$ is at most 2 for $s \ge 2p_{\max}$.
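The transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `slot_schedule`, the 0-based machine index, and the representation of a schedule as a table of start times are all hypothetical choices, not notation from the paper. Each original window of length $s - p_{\max}$ is stretched to a slot of length $s$, and machine $i$ (0-based) is then shifted by $i \cdot s$.

```python
def slot_schedule(start, p, s):
    """Map a feasible flow shop schedule to one obeying the slotting constraints.

    start[j][i] and p[j][i] are the start time and size of job j on machine i
    (machines 0-based, in processing order); s is the slot size, s >= 2 * p_max.
    """
    p_max = max(max(row) for row in p)
    assert s >= 2 * p_max
    w = s - p_max  # length of the windows cut out of the original schedule
    new_start = []
    for row in start:
        new_row = []
        for i, t in enumerate(row):
            q, offset = divmod(t, w)  # window index and offset inside the window
            # window q becomes slot q (gap merged in); machine i is delayed by i slots
            new_row.append(q * s + offset + i * s)
        new_start.append(new_row)
    return new_start


# Two jobs on three machines, scheduled one after the other (makespan 6).
example_start = [[0, 2, 3], [2, 3, 5]]
example_sizes = [[2, 1, 2], [1, 2, 1]]
slotted = slot_schedule(example_start, example_sizes, 4)
```

In this small instance the slotted schedule has makespan 18, which respects the bound $\frac{s}{s - p_{\max}} OPT + (m-1)s = 20$.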

3 An Integral Multicommodity Flow Formulation 

In this section we obtain an approximation to the flow shop scheduling problem 
with the slotting restriction. We begin by “guessing” the number of time-slots 
required by the schedule. Construct a directed graph G = (V,E) which has a 
vertex for each (time-slot, machine) pair. There is an edge directed from vertex 
$u = (a, i)$ to vertex $v = (b, j)$ if $j = i + 1$ and $a < b$. For each job $j$ we have two
vertices $s_j$ and $t_j$. $s_j$ has edges to all vertices corresponding to the first machine
and $t_j$ has edges from all vertices corresponding to the last machine.
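As a concrete illustration of this construction, the sketch below lists the edges of the graph for a guessed number of slots. The function name `build_slot_graph` and the tuple encoding of vertices (`(slot, machine)` pairs, plus `("s", j)` and `("t", j)` for the terminals of job `j`) are assumptions made for the example.

```python
def build_slot_graph(k, m, n):
    """Edges of the (time-slot, machine) graph for k slots, m machines, n jobs."""
    edges = []
    # (a, i) -> (b, i + 1) whenever slot a comes strictly before slot b
    for i in range(1, m):
        for a in range(1, k + 1):
            for b in range(a + 1, k + 1):
                edges.append(((a, i), (b, i + 1)))
    # each job has a source feeding machine 1 and a sink fed by machine m
    for j in range(1, n + 1):
        for a in range(1, k + 1):
            edges.append((("s", j), (a, 1)))
            edges.append(((a, m), ("t", j)))
    return edges


edges = build_slot_graph(k=3, m=2, n=2)
```

The strict inequality $a < b$ is what forces consecutive operations of a job into distinct slots.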

With this graph we associate a multicommodity flow instance; there is one 
commodity associated with each job j and Sj , tj are the source and sink for this 
commodity. The flow of each commodity should be conserved: the total flow of 
a commodity entering a vertex (other than the source/sink of that commodity) 
is equal to the flow of that commodity leaving the vertex. We wish to route one 
unit of each commodity subject to the following throughput constraints on the 
vertices. Consider a vertex $v = (a, i)$ and let $x_{v,j}$ be the flow of commodity $j$
through $v$. Then

$\sum_j x_{v,j}\, p_{i,j} \le s$

Note that an integral multicommodity flow corresponds to a flow shop sched- 
ule satisfying the slotting restrictions. The feasibility of a multicommodity flow 
instance can be determined in polynomial time by formulating it as a linear 
program [7]. Infeasibility of the multicommodity flow instance implies that our
guess on the makespan is too small. When this happens we increase our guess on 
the number of time slots. Let k be the smallest number of time slots for which 






the multicommodity flow instance is feasible and let $F$ be the corresponding
flow. If $T$ denotes the minimum makespan of a flow shop schedule satisfying the
slotting constraints then $T \ge (k-1)s$.

4 The Algorithm and Its Analysis 

$F$ is a (fractional) multicommodity flow which routes one unit of each commodity
while respecting the throughput constraints on the vertices. Flow theory says
that the flow of any commodity can be viewed as a collection of at most $|E|$
paths. With each path we associate a weight which is just the flow along that
path. Hence the total weight of all paths corresponding to commodity $j$ is one.

The randomized algorithm picks exactly one path for each commodity with 
the probability of picking a path equal to its weight. This collection of paths, 
one for each commodity, gives an integral multicommodity flow which (possibly) 
violates the throughput constraints. The integral multicommodity flow in turn 
defines a flow shop schedule which satisfies the slotting constraints but which 
might be infeasible as the total processing time of operations scheduled on a
machine in a specific time-slot might exceed s. 
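The rounding step can be sketched as follows, assuming the fractional flow has already been decomposed into weighted paths. The function name `round_paths` and the dictionary layout of `paths` and `sizes` are hypothetical; the sketch only picks one path per job with probability equal to its weight and reports the resulting load on each (slot, machine) vertex.

```python
import random
from collections import defaultdict


def round_paths(paths, sizes, rng=random):
    """Pick one path per job with probability equal to the path's weight.

    paths[j] is a list of (weight, path) pairs whose weights sum to 1, where a
    path is a list of (slot, machine) vertices; sizes[j][i] is the size of
    job j on machine i. Returns the chosen paths and the load on each vertex.
    """
    chosen = {}
    load = defaultdict(float)
    for j, options in paths.items():
        weights = [w for w, _ in options]
        _, path = rng.choices(options, weights=weights, k=1)[0]
        chosen[j] = path
        for slot, machine in path:
            load[(slot, machine)] += sizes[j][machine]
    return chosen, dict(load)


paths = {
    0: [(1.0, [(1, 1), (2, 2)])],
    1: [(0.5, [(1, 1), (3, 2)]), (0.5, [(2, 1), (3, 2)])],
}
sizes = {0: {1: 2, 2: 1}, 1: {1: 1, 2: 2}}
chosen, load = round_paths(paths, sizes, random.Random(0))
```

A vertex whose load exceeds $s$ is exactly a slot in which the rounded schedule is infeasible before the slots are expanded.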

Consider vertex $v = (a, i)$. Let $X_j$ be a random variable which is 1 if job $j$ is
scheduled on machine $i$ in slot $a$ and 0 otherwise. Let $X$ be a random variable
defined as

$X = \sum_j p_{ij} X_j$

Claim. $E[X] \le s$.

Proof. The probability that $X_j$ is 1 equals $x_{v,j}$. Hence, $E[X_j] = x_{v,j}$. By
linearity of expectation it follows that $E[X] = \sum_j p_{ij} E[X_j] = \sum_j p_{ij} x_{v,j}$. The
throughput constraint on vertex $v$ implies $\sum_j p_{ij} x_{v,j} \le s$, from which the claim
follows.



Since the random variable $X$ is a linear combination of $n$ independent Poisson
trials $X_1, X_2, \ldots, X_n$ such that $\Pr[X_j = 1] = x_{v,j}$, using Chernoff bounds we
obtain

$\Pr[X > (1+\delta)E[X]] < \left[\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right]^{E[X]/p_{\max}}$



Observe that a trivial flow shop schedule satisfying the slotting constraints
can be obtained by assigning job $j$ to slot $j + i - 1$ on machine $i$, $1 \le j \le n$
and $1 \le i \le m$. This schedule has makespan $(n + m - 1)s$ and hence the
number of vertices in the graph (excluding the source and sink vertices) is at
most $m(n + m - 1)$.

Let $c = (1+\delta)^{(1+\delta)}/e^{\delta}$ and choose $s = \max\{2, \log_c[2m(n+m-1)]\}\, p_{\max}$. From
the above argument it follows that $\Pr[X > (1+\delta)s] < \frac{1}{2m(n+m-1)}$. Hence the
probability that for some machine and some time-slot the total processing time
of operations scheduled on this machine in this time-slot is more than $(1+\delta)s$







is at most 1/2. Equivalently, with probability at least 1/2 the processing time
in every slot is less than $(1+\delta)s$. Expanding each slot of size $s$ to a slot of size
$(1+\delta)s$ then gives us a flow shop schedule of makespan $(1+\delta)ks$.
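To see how the choice of $s$ interacts with the union bound, the following sketch computes $c$ and $s$ for given parameters. The function names `chernoff_base` and `slot_size` are hypothetical; the formulas are the ones stated above, with $c = (1+\delta)^{(1+\delta)}/e^{\delta}$.

```python
import math


def chernoff_base(delta):
    """c = (1 + delta)^(1 + delta) / e^delta, which exceeds 1 for delta > 0."""
    return (1 + delta) ** (1 + delta) / math.exp(delta)


def slot_size(m, n, p_max, delta):
    """s = max{2, log_c[2 m (n + m - 1)]} * p_max."""
    c = chernoff_base(delta)
    vertices = m * (n + m - 1)  # upper bound on (slot, machine) vertices
    return max(2.0, math.log(2 * vertices, c)) * p_max


s = slot_size(m=3, n=5, p_max=1.0, delta=1.0)
# Per-vertex failure probability is at most c^(-s / p_max); summing over all
# m(n + m - 1) vertices gives a total failure probability of at most 1/2.
union_bound = 3 * (5 + 3 - 1) * chernoff_base(1.0) ** (-s / 1.0)
```

When the logarithm dominates the maximum, the union bound evaluates to exactly 1/2, which is where the success probability in the analysis comes from.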

Recall that any flow shop schedule satisfying the slotting restriction has
makespan at least $(k-1)s$. Further, a flow shop schedule of makespan $OPT$ yields
a schedule satisfying the slotting restrictions and having makespan
$\frac{s}{s - p_{\max}} OPT + (m-1)s$. This implies that

$(k-1)s \le \frac{s}{s - p_{\max}} OPT + (m-1)s$

Hence with probability at least 1/2 the randomized rounding procedure gives us
a feasible schedule whose makespan is bounded by

$(1+\delta)ks \le \frac{(1+\delta)s}{s - p_{\max}} OPT + (1+\delta)ms$

where $s = \max\{2, \log_c[2m(n+m-1)]\}\, p_{\max}$, $c = (1+\delta)^{(1+\delta)}/e^{\delta}$ and $\delta$ is a positive
constant chosen so that $c > 1$.

Theorem 4.1. There exists a polynomial time randomized algorithm for flow
shop scheduling which with probability at least 1/2 finds a schedule with makespan
at most

$2(1+\delta)\, OPT + m(1+\delta)\, p_{\max} \log_c[2m(n+m-1)]$

where $c = (1+\delta)^{(1+\delta)}/e^{\delta}$.
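The step from the makespan bound to the form stated in the theorem, and from there to the additive term quoted in the abstract, can be spelled out as a short derivation. The restriction $0 < \delta \le 1$ and the estimate $(1+\delta)\ln(1+\delta) - \delta \ge \delta^2/3$ are the standard simplification of the Chernoff bound and are an added assumption here, not a statement taken from the paper.

```latex
\begin{align*}
(1+\delta)ks
  &\le \frac{(1+\delta)s}{s - p_{\max}}\, OPT + (1+\delta)ms \\
  &\le 2(1+\delta)\, OPT + (1+\delta)ms
     && \text{since } s \ge 2p_{\max} \\
  &= 2(1+\delta)\, OPT + m(1+\delta)\, p_{\max} \log_c[2m(n+m-1)]
     && \text{when } \log_c[2m(n+m-1)] \ge 2, \\
\log_c x
  &= \frac{\ln x}{(1+\delta)\ln(1+\delta) - \delta}
   \le \frac{3 \ln x}{\delta^2}
     && \text{for } 0 < \delta \le 1 .
\end{align*}
```

With $x = 2m(n+m-1)$ this gives an additive term of order $m \ln(m+n)\, p_{\max}/\delta^2$.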



References 

1. Leslie A. Goldberg, Mike Paterson, Aravind Srinivasan, and Elizabeth Sweedyk.
Better approximation guarantees for job shop scheduling. In Proc. 8th ACM-SIAM
Symp. on Discrete Algorithms (SODA), pages 599-608, 1997.

2. Leslie A. Hall. Approximability of flow shop scheduling. In Proc. 36th IEEE Annual
Symp. on Foundations of Computer Science, pages 82-91, 1995.

3. S.M. Johnson. Optimal two- and three-stage production schedules with setup times
included. Naval Research Logistics Quarterly, 1:61-68, 1954.

4. J.P. Schmidt, A. Siegel, and A. Srinivasan. Chernoff-Hoeffding bounds for
applications with limited independence. In Proc. 4th ACM-SIAM Symp. on Discrete
Algorithms (SODA), pages 331-340, 1993.

5. S. Sevast'janov. Bounding algorithm for the routing problem with arbitrary paths
and alternative servers. Cybernetics, 22:773-780, 1986.

6. S. Sevast'janov and G. Woeginger. Makespan minimization in open shops: A
polynomial time approximation scheme. Mathematical Programming, 82(1-2), 1998.

7. D.B. Shmoys. Approximation algorithms for NP-Hard problems, chapter Cut
problems and their applications to divide and conquer, pages 192-234. PWS, 1997.

8. D.B. Shmoys, C. Stein, and J. Wein. Improved approximation algorithms for shop
scheduling problems. SIAM Journal on Computing, 23:617-632, 1994.

9. Maxim Sviridenko, K. Jansen, and R. Solis-Oba. Makespan minimization in job
shops: A polynomial time approximation scheme. In Proc. 31st Annual ACM
Symp. on Theory of Computing, pages 394-399, 1999.

10. D.P. Williamson, L.A. Hall, J.A. Hoogeveen, C.A.J. Hurkens, J.K. Lenstra, and
S.V. Sevast'janov. Short shop schedules. Operations Research, 45:288-294, 1997.



Synthesizing Distributed Transition Systems 
from Global Specifications* 



Ilaria Castellani^1, Madhavan Mukund^2, and P. S. Thiagarajan^2

^1 INRIA Sophia Antipolis,
2004 route des Lucioles, B.P. 93, 06092 Sophia Antipolis Cedex, France
Ilaria.Castellani@sophia.inria.fr

^2 Chennai Mathematical Institute,
92 G.N. Chetty Road, Chennai 600 017, India
{madhavan,pst}@smi.ernet.in



Abstract. We study the problem of synthesizing distributed implemen- 
tations from global specifications. In particular, we characterize when a 
global transition system can be implemented as a synchronized product 
of local transition systems. Our work extends a number of previous stud- 
ies in this area which have tended to make strong assumptions about the 
specification — either in terms of determinacy or in terms of information 
concerning concurrency. 

We also examine the more difficult problem where the correctness of 
the implementation in relation to the specification is stated in terms of 
bisimulation rather than isomorphism. As an important first step, we 
show how the synthesis problem can be solved in this setting when the 
implementation is required to be deterministic. 



1 Introduction 

Designing distributed systems has always been a challenging task. Interactions 
between the processes can introduce subtle errors in the system’s overall be- 
haviour which may pass undetected even after rigorous testing. A fruitful ap- 
proach in recent years has been to specify the behaviour of the overall system in 
a global manner and then automatically synthesize a distributed implementation 
from the specification. 

The question of identifying when a sequential specification has an implemen- 
tation in terms of a desired distributed architecture was first raised in the con- 
text of Petri nets. Ehrenfeucht and Rozenberg [ER90] introduced the concept 
of regions to describe how to associate places of nets with states of a transi- 
tion system. In [NRT92], Nielsen, Rozenberg and Thiagarajan use regions to 
characterize the class of transition systems which arise from elementary net sys- 
tems. Subsequently, several authors have extended this characterization to larger 
classes of nets (for a sample of the literature, see [BD98,Muk92,WN95]). 

* This work has been sponsored by IFCPAR Project 1502-1. This work has also been
supported in part by BRICS (Basic Research in Computer Science, Centre of the
Danish National Research Foundation), Aarhus University, Denmark.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 219—231, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






Here, we focus on product transition systems: networks of transition systems
which coordinate by synchronizing on common actions [Arn94]. This model
comes with a natural notion of component and induced notions of concurrency
and causality. It has a well-understood theory, at least in the linear-time
setting [Thi95]. This model is also the basis for system descriptions in a number of
model-checking tools [Kur94,Hol97].

We establish two main sets of results in this paper. First, we characterize 
when an arbitrary transition system is isomorphic to a product transition sys- 
tem with a specified distribution of actions. Our characterization is effective — 
for finite-state specifications, we can synthesize a finite-state implementation. 
We then show how to obtain implementations when concurrency is specified in 
terms of an abstract independence relation, in the sense of Mazurkiewicz trace 
theory [Maz89] . We also present realizability relationships between product tran- 
sition systems in terms of a natural preorder over the distribution of actions 
across agents. Our result subsumes the work of Morin [Mor98] on synthesizing 
product systems from deterministic specifications. 

Our second result deals with the situation when we have global specifications 
which are behaviourally equivalent to, but not necessarily isomorphic to, product 
systems. The notion of behavioural equivalence which we use is strong
bisimulation [Mil89]. The synthesis problem here is to implement a global transition
system $TS$ as a product transition system that is bisimilar to $TS$.
We show how to solve this problem when the implementation is
deterministic. Notice that the specification itself may be nondeterministic. Since 
distributed systems implemented in hardware, such as digital controllers, are de- 
terministic, the determinacy assumption is a natural one. Solving the synthesis 
problem modulo bisimulation in the general case where the implementation may 
be nondeterministic appears to be hard. 

The problem of expressing a global transition system as a product of com- 
ponent transition systems modulo bisimilarity has also been investigated in the 
context of process algebras in [Mol89,MM93]. In [GM92], Groote and Moller 
examine the use of decomposition techniques for the verification of parallel sys- 
tems. These results are established in the context of transition systems which 
are generated using process algebra expressions. Generalizing these results to 
arbitrary, unstructured, finite-state systems appears hard. 

The paper is organized as follows. In the next section we formally introduce 
product transition systems and formulate the synthesis problem. In Section 3, 
we characterize the class of transition systems which are isomorphic to product 
transition systems. The subsequent section extends these results to the context 
where the distributed implementation is described using an independence rela- 
tion. Next, we show that deterministic systems admit canonical minimal imple- 
mentations. In Section 6, we present our second main result, characterizing the 
class of transition systems which are bisimilar to deterministic product systems. 






2 The Synthesis Problem for Product Transition Systems 

Labelled transition systems provide a general framework for modelling comput- 
ing systems. A labelled transition system is defined as follows. 

Definition 2.1. Let $\Sigma$ be a finite nonempty set of actions. A labelled transition
system over $\Sigma$ is a structure $TS = (Q, \rightarrow, q_{in})$, where $Q$ is a set of states,
$q_{in} \in Q$ is the initial state and $\rightarrow \subseteq Q \times \Sigma \times Q$ is the transition relation.

We abbreviate a transition sequence of the form $q_0 \xrightarrow{a_1} q_1 \cdots \xrightarrow{a_n} q_n$ as
$q_0 \xrightarrow{a_1 \cdots a_n} q_n$. In every transition system $TS = (Q, \rightarrow, q_{in})$ which we encounter,
we assume that each state in $Q$ is reachable from the initial state; that is, for
each $q \in Q$ there exists a transition sequence $q_{in} = q_0 \xrightarrow{a_1 \cdots a_n} q_n = q$.

A large class of distributed systems can be fruitfully modelled as networks of 
local transition systems whose moves are globally synchronized through common 
actions. To formalize this, we begin with the notion of a distributed alphabet. 

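Definition 2.2. A distributed alphabet over $\Sigma$, or a distribution of $\Sigma$, is a tuple
of nonempty sets $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ such that $\bigcup_{1 \le i \le k} \Sigma_i = \Sigma$. For each action
$a \in \Sigma$, the locations of $a$ are given by the set $loc_{\widetilde{\Sigma}}(a) = \{i \mid a \in \Sigma_i\}$. If $\widetilde{\Sigma}$ is
clear from the context, we write just $loc(a)$ to denote $loc_{\widetilde{\Sigma}}(a)$.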

We consider two distributions to be the same if they differ only in the order of 
their components. 

Henceforth, for any natural number $k$, $[1..k]$ denotes the set $\{1, 2, \ldots, k\}$.

Definition 2.3. Let $(\Sigma_1, \ldots, \Sigma_k)$ be a distribution of $\Sigma$. For each $i \in [1..k]$, let
$TS_i = (Q_i, \rightarrow_i, q^i_{in})$ be a transition system over $\Sigma_i$. The product
$(TS_1 \parallel \cdots \parallel TS_k)$ is the transition system $TS = (Q, \rightarrow, q_{in})$ over
$\Sigma = \bigcup_{1 \le i \le k} \Sigma_i$, where:

- $q_{in} = (q^1_{in}, \ldots, q^k_{in})$.

- $Q \subseteq (Q_1 \times \cdots \times Q_k)$ and $\rightarrow \subseteq Q \times \Sigma \times Q$ are defined inductively by:

  * $q_{in} \in Q$.

  * Let $q \in Q$ and $a \in \Sigma$. For $i \in [1..k]$, let $q[i]$ denote the $i$-th component
    of $q$. If for each $i \in loc(a)$, $TS_i$ has a transition $q[i] \xrightarrow{a}_i q'_i$, then
    $q \xrightarrow{a} q'$ and $q' \in Q$, where $q'[i] = q'_i$ for $i \in loc(a)$ and $q'[j] = q[j]$ for
    $j \notin loc(a)$.

We often abbreviate the product $(TS_1 \parallel \cdots \parallel TS_k)$ by $\parallel_{i \in [1..k]} TS_i$.
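The inductive definition of the product translates directly into a reachability search. The sketch below is one possible rendering, with hypothetical names (`sync_product`, components given as `(initial_state, transitions)` pairs); the two components encode one reading of the terms $(a + c)$ and $(bc + c)$ that appear in Example 2.4, which is an assumption about how those process-algebra expressions unfold into transition systems.

```python
from collections import deque
from itertools import product as cartesian


def sync_product(components, alphabets):
    """Reachable part of the synchronized product.

    components[i] = (initial_state, transitions) with transitions a set of
    (state, action, state) triples; alphabets[i] is the action set of
    component i. Returns (states, transitions, initial_state).
    """
    init = tuple(q0 for q0, _ in components)
    actions = set().union(*alphabets)
    states, transitions = {init}, set()
    queue = deque([init])
    while queue:
        q = queue.popleft()
        for a in actions:
            loc = [i for i, alpha in enumerate(alphabets) if a in alpha]
            # local a-successors of every component that takes part in a
            options = [[t for (s, b, t) in components[i][1]
                        if s == q[i] and b == a] for i in loc]
            for combo in cartesian(*options):
                nxt = list(q)
                for i, t in zip(loc, combo):
                    nxt[i] = t  # components outside loc(a) stay put
                nxt = tuple(nxt)
                transitions.add((q, a, nxt))
                if nxt not in states:
                    states.add(nxt)
                    queue.append(nxt)
    return states, transitions, init


left = (0, {(0, "a", 1), (0, "c", 2)})                 # a + c
right = (0, {(0, "b", 1), (1, "c", 2), (0, "c", 3)})   # bc + c
states, transitions, init = sync_product([left, right], [{"a", "c"}, {"b", "c"}])
```

If some participating component has no local move on `a`, the Cartesian product is empty and no global `a`-transition is generated, which matches the requirement that every component in $loc(a)$ must move.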

The synthesis problem 

The synthesis problem can now be formulated as follows. If $TS =
(Q, \rightarrow, q_{in})$ is a transition system over $\Sigma$, and $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ is a
distribution of $\Sigma$, does there exist a $\widetilde{\Sigma}$-implementation of $TS$, that is, a tuple of
transition systems $(TS_1, \ldots, TS_k)$ such that $TS_i$ is a transition system over $\Sigma_i$
and the product $\parallel_{i \in [1..k]} TS_i$ is isomorphic to $TS$?







Example 2.4. Let $\widetilde{\Sigma} = (\{a, c\}, \{b, c\})$ be a distribution of $\{a, b, c\}$. The first
transition system below is $\widetilde{\Sigma}$-implementable: using expressions in the style of process
algebra, we can write the product as $(a + c) \parallel (bc + c)$. Similarly, the second
transition system may be implemented as $(ac + c) \parallel (bc + c)$.




On the other hand, the system on the right is not $\widetilde{\Sigma}$-implementable. Intuitively,
the argument is as follows. If it were implementable in two components with
alphabets $\{a, c\}$ and $\{b, c\}$, $c$ would be enabled at the initial state in both
components. But $c$ can also occur after both actions $a$ and $b$ have occurred. So, $c$ is
possible after $a$ in the first component and after $b$ in the second component. Thus,
there are two $c$ transitions in both components and the product should exhibit all
their combinations, giving rise to the system in the centre. We can formalize this
argument once we have proved the results in the next section.

3 A Characterization of Implementable Systems 

We now characterize $\widetilde{\Sigma}$-implementable systems. In this section, unless otherwise
specified, we assume that $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$, with $\Sigma = \bigcup_{i \in [1..k]} \Sigma_i$.

The basic idea is to label each state of the given system by a $k$-tuple of
local states (corresponding to a global state of a product system) such that
the labelling function satisfies some consistency conditions. We formulate this
labelling function in terms of local equivalence relations on the states of the
original system: for each $i \in [1..k]$, if two states $q_1$ and $q_2$ of the original system
are $i$-equivalent, the interpretation is that the global states assigned to $q_1$ and $q_2$
by the labelling function agree on the $i$-th component. Our technique is similar
to the one developed independently by Morin for the more restrictive class of
deterministic transition system specifications [Mor98].

Theorem 3.1. A transition system $TS = (Q, \rightarrow, q_{in})$ is $\widetilde{\Sigma}$-implementable with
respect to a distribution $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ if and only if for each $i \in [1..k]$ there
exists an equivalence relation $\equiv_i \subseteq (Q \times Q)$ such that the following conditions
are satisfied:

(i) If $q \xrightarrow{a} q'$ and $a \notin \Sigma_i$, then $q \equiv_i q'$.

(ii) If $q \equiv_i q'$ for every $i$, then $q = q'$.

(iii) Let $q \in Q$ and $a \in \Sigma$. If for each $i \in loc(a)$, there exist $s_i, s'_i \in Q$ such
that $s_i \equiv_i q$ and $s_i \xrightarrow{a} s'_i$, then for each choice of such $s_i$'s and $s'_i$'s there
exists $q' \in Q$ such that $q \xrightarrow{a} q'$ and for each $i \in loc(a)$, $q' \equiv_i s'_i$.






Proof. ($\Rightarrow$): Suppose $\parallel_{i \in [1..k]} TS_i$ is a $\widetilde{\Sigma}$-implementation of $TS$. We must
exhibit $k$ equivalence relations $\{\equiv_i\}_{i \in [1..k]}$ such that conditions (i)-(iii) are
satisfied. Assume, without loss of generality, that $TS$ is not just isomorphic to
$\parallel_{i \in [1..k]} TS_i$ but is in fact equal to $\parallel_{i \in [1..k]} TS_i$.

For $i \in [1..k]$, let $TS_i = (Q_i, \rightarrow_i, q^i_{in})$. We then have $Q \subseteq (Q_1 \times \cdots \times Q_k)$ and
$q_{in} = (q^1_{in}, \ldots, q^k_{in})$. Define $\equiv_i \subseteq (Q \times Q)$ as follows: $q \equiv_i q'$ iff $q[i] = q'[i]$.

Since $TS$ is a product transition system, it is clear that conditions (i) and
(ii) are satisfied. To establish condition (iii), fix $q \in Q$ and $a \in \Sigma$. Suppose that
for each $i \in loc(a)$ there is a transition $s_i \xrightarrow{a} s'_i$ such that $s_i \equiv_i q$. Clearly, for
each $i \in loc(a)$, $s_i \xrightarrow{a} s'_i$ implies $s_i[i] \xrightarrow{a}_i s'_i[i]$. Moreover $s_i[i] = q[i]$ by the
definition of $\equiv_i$. Since $TS$ is a product transition system, this implies $q \xrightarrow{a} q'$,
where $q'[i] = s'_i[i]$ for $i \in loc(a)$ and $q'[i] = q[i]$ otherwise.

($\Leftarrow$): Suppose we are given equivalence relations $\{\equiv_i \subseteq (Q \times Q)\}_{i \in [1..k]}$ which
satisfy conditions (i)-(iii). For each $q \in Q$ and $i \in [1..k]$, let $[q]_i = \{s \mid s \equiv_i q\}$.
For $i \in [1..k]$, define the transition system $TS_i = (Q_i, \rightarrow_i, q^i_{in})$ over $\Sigma_i$ as follows:

- $Q_i = \{[q]_i \mid q \in Q\}$, with $q^i_{in} = [q_{in}]_i$.

- $[q]_i \xrightarrow{a}_i [q']_i$ iff $a \in \Sigma_i$ and there exists $s \xrightarrow{a} s'$ with $s \equiv_i q$ and $s' \equiv_i q'$.

We wish to show that $TS$ is isomorphic to $\parallel_{i \in [1..k]} TS_i$. Let $\parallel_{i \in [1..k]} TS_i =
(\widehat{Q}, \Rightarrow, \widehat{q}_{in})$. We claim that the required isomorphism is given by the function
$f : Q \to \widehat{Q}$, where $f(q) = ([q]_1, \ldots, [q]_k)$.

- We can show that $f$ is well-defined, that is, $f(q) \in \widehat{Q}$ for each $q$, by induction
  on the length of the shortest path from $q_{in}$ to $q$. We omit the details.

- We next establish that $f$ is a bijection. Clearly condition (ii) implies that $f$
  is injective. To argue that $f$ is onto, let $([s_1]_1, \ldots, [s_k]_k) \in \widehat{Q}$ be reachable
  from $\widehat{q}_{in}$ in $n$ steps. We proceed by induction on $n$.

  * Basis: If $n = 0$, $([s_1]_1, \ldots, [s_k]_k) = \widehat{q}_{in} = f(q_{in})$.

  * Induction step:
    Let $([r_1]_1, \ldots, [r_k]_k)$ be reachable from $\widehat{q}_{in}$ in $n - 1$ steps. Consider a
    move $([r_1]_1, \ldots, [r_k]_k) \stackrel{a}{\Rightarrow} ([s_1]_1, \ldots, [s_k]_k)$. By the induction
    hypothesis there exists $q \in Q$ such that $f(q) = ([q]_1, \ldots, [q]_k) =
    ([r_1]_1, \ldots, [r_k]_k)$. Now, $([r_1]_1, \ldots, [r_k]_k) \stackrel{a}{\Rightarrow} ([s_1]_1, \ldots, [s_k]_k)$ implies
    that $[r_i]_i \xrightarrow{a}_i [s_i]_i$ for each $i \in loc(a)$. Hence, for each $i \in loc(a)$,
    there exist $r'_i, s'_i$ such that $r'_i \equiv_i r_i$, $s'_i \equiv_i s_i$ and $r'_i \xrightarrow{a} s'_i$. By
    condition (iii), since $q \equiv_i r_i$, for any choice of such $r'_i$'s and $s'_i$'s there
    exists $q'$ such that $q \xrightarrow{a} q'$ and $q' \equiv_i s'_i$ for each $i \in loc(a)$. We want
    to show that $f(q') = ([s_1]_1, \ldots, [s_k]_k)$. For $i \in loc(a)$ we already know
    that $q' \equiv_i s'_i \equiv_i s_i$. So suppose $i \notin loc(a)$. In this case $q \xrightarrow{a} q'$ implies
    $[q']_i = [q]_i$ by condition (i). From $[q]_i = [r_i]_i$ and $[r_i]_i = [s_i]_i$ it follows
    that $[q']_i = [s_i]_i$.

- It is now easy to argue that $f$ is an isomorphism; we omit the details.
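The three conditions of Theorem 3.1 can be checked mechanically for a given candidate family of equivalence relations. The sketch below assumes a hypothetical encoding in which each relation is a dictionary from states to class labels; the function name `satisfies_theorem_3_1` is likewise an assumption. The example is a four-state diamond in which the actions $a$ and $b$ belong to different components.

```python
from itertools import product


def satisfies_theorem_3_1(states, trans, alphabets, classes):
    """Check conditions (i)-(iii) for one candidate family of equivalences.

    trans is a set of (q, a, q') triples, alphabets[i] the action set of
    component i, and classes[i][q] the label of the class of q under the
    i-th equivalence relation.
    """
    k = len(alphabets)

    def equiv(i, q, r):
        return classes[i][q] == classes[i][r]

    # (i) a move outside component i does not change the i-th class
    for (q, a, r) in trans:
        if any(a not in alphabets[i] and not equiv(i, q, r) for i in range(k)):
            return False
    # (ii) states equivalent in every component are identical
    for q in states:
        for r in states:
            if q != r and all(equiv(i, q, r) for i in range(k)):
                return False
    # (iii) every combination of local a-moves is matched by a global a-move
    for q in states:
        for a in set().union(*alphabets):
            loc = [i for i in range(k) if a in alphabets[i]]
            local = [[(s, t) for (s, b, t) in trans if b == a and equiv(i, s, q)]
                     for i in loc]
            targets = [r for (s, b, r) in trans if s == q and b == a]
            for choice in product(*local):
                if not any(all(equiv(i, r, t) for i, (_, t) in zip(loc, choice))
                           for r in targets):
                    return False
    return True


diamond = {(0, "a", 1), (0, "b", 2), (1, "b", 3), (2, "a", 3)}
classes = [{0: 0, 2: 0, 1: 1, 3: 1}, {0: 0, 1: 0, 2: 1, 3: 1}]
ok = satisfies_theorem_3_1({0, 1, 2, 3}, diamond, [{"a"}, {"b"}], classes)
```

A negative answer for one family only rules out that family; deciding implementability requires trying every family.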







An effective synthesis procedure 

Observe that Theorem 3.1 yields an effective synthesis procedure for finite- 
state specifications which is exponential in the size of the original transition 
system and the number of components in the distributed alphabet. The number 
of ways of partitioning a finite-state space using equivalence relations is bounded 
and we can exhaustively check each choice to see if it meets criteria (i)-(iii) in 
the statement of the theorem. 
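To get a feeling for the size of this search space, the following sketch enumerates all partitions of a small state set; each partition corresponds to one equivalence relation. The generator name `partitions` is hypothetical, and the sketch covers only the enumeration, not the check of criteria (i)-(iii).

```python
def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # place `first` into each existing block in turn ...
        for idx in range(len(part)):
            yield part[:idx] + [[first] + part[idx]] + part[idx + 1:]
        # ... or into a new block of its own
        yield [[first]] + part


candidates = list(partitions([0, 1, 2, 3]))
# With k components, the exhaustive search tries len(candidates) ** k families.
```

The number of partitions of an $n$-element set is the Bell number of $n$, so the number of candidate families grows very quickly with both the number of states and the number of components.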

4 Synthesis and Independence Relations 

An abstract way of enriching a labelled transition system with information about 
concurrency is to equip the underlying alphabet with an independence relation — 
intuitively, this relation specifies which pairs of actions in the system can be 
executed independent of each other. 

Definition 4.1. An independence relation over $\Sigma$ is a symmetric, irreflexive
relation $I \subseteq \Sigma \times \Sigma$.

Each distribution $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ induces a natural independence relation
$I_{\widetilde{\Sigma}}$ over $\Sigma$: two actions are independent if they are performed at
nonoverlapping sets of locations across the system. Formally, for $a, b \in \Sigma$,
$a\ I_{\widetilde{\Sigma}}\ b \Leftrightarrow loc(a) \cap loc(b) = \emptyset$. The following example shows that different distributions
may yield the same independence relation.

Example 4.2. If $\Sigma = \{a, b, c, d\}$ and $I = \{(a, b), (b, a)\}$, then the distributions
$\widetilde{\Sigma} = (\{a, c, d\}, \{b, c, d\})$, $\widetilde{\Sigma}' = (\{a, c, d\}, \{b, c\}, \{b, d\})$ and $\widetilde{\Sigma}'' = (\{a, c\}, \{a, d\},
\{b, c\}, \{b, d\}, \{c, d\})$ all give rise to the independence relation $I$.

However, for each independence relation $I$ there is a standard distribution inducing
$I$, whose components are the maximal cliques of the dependency relation
$D = (\Sigma \times \Sigma) - I$. In the example above, the standard distribution is the one
denoted $\widetilde{\Sigma}$. Henceforth, we will denote the standard distribution for $I$ by $\widetilde{\Sigma}_I$.
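Example 4.2 and the standard distribution can be checked with a few lines of Python. The function names `induced_independence` and `standard_distribution` are hypothetical, and the clique computation is a brute-force search over subsets that is only meant for very small alphabets.

```python
from itertools import combinations


def induced_independence(dist):
    """Pairs (a, b) of actions that share no component of the distribution."""
    sigma = set().union(*dist)
    return {(a, b) for a in sigma for b in sigma
            if not any(a in comp and b in comp for comp in dist)}


def standard_distribution(sigma, indep):
    """Maximal cliques of the dependency relation, by brute force."""
    def is_clique(sub):
        return all((a, b) not in indep for a, b in combinations(sub, 2))

    cliques = [set(sub) for r in range(1, len(sigma) + 1)
               for sub in combinations(sorted(sigma), r) if is_clique(sub)]
    # keep only cliques that are not strictly contained in another clique
    return [c for c in cliques if not any(c < d for d in cliques)]


I = {("a", "b"), ("b", "a")}
d1 = [{"a", "c", "d"}, {"b", "c", "d"}]
d2 = [{"a", "c", "d"}, {"b", "c"}, {"b", "d"}]
d3 = [{"a", "c"}, {"a", "d"}, {"b", "c"}, {"b", "d"}, {"c", "d"}]
same_relation = all(induced_independence(d) == I for d in (d1, d2, d3))
standard = standard_distribution({"a", "b", "c", "d"}, I)
```

For this example the brute-force search returns the two components of the first distribution, in agreement with the text.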

The synthesis problem with respect to an independence relation 

We can phrase the synthesis problem in terms of independence relations as
follows. Given a transition system $TS = (Q, \rightarrow, q_{in})$ and a nonempty
independence relation $I$ over $\Sigma$, does there exist an $I$-implementation of $TS$, that is, a
$\widetilde{\Sigma}$-implementation of $TS$ such that $\widetilde{\Sigma}$ induces $I$?

We show that if a transition system admits a $\widetilde{\Sigma}$-implementation, where $\widetilde{\Sigma}$
induces $I$, then it also admits a $\widetilde{\Sigma}_I$-implementation. Thus the synthesis problem with
respect to independence relations reduces to the synthesis problem for standard distributions.

We begin by showing that a system that has an implementation over a
distribution $\widetilde{\Sigma}$ also has an implementation over any coarser distribution $\widetilde{\Gamma}$, obtained
by merging some components of $\widetilde{\Sigma}$ and possibly adding some new ones.

Definition 4.3. Let $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ and $\widetilde{\Gamma} = (\Gamma_1, \ldots, \Gamma_\ell)$ be distributions of
$\Sigma$. Then $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$ if for each $i \in [1..k]$, there exists $j \in [1..\ell]$ such that $\Sigma_i \subseteq \Gamma_j$.
If $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$ we say that $\widetilde{\Sigma}$ is finer than $\widetilde{\Gamma}$, or $\widetilde{\Gamma}$ is coarser than $\widetilde{\Sigma}$.






We then have the following simple observation. 

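Proposition 4.4. If $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$ then $I_{\widetilde{\Gamma}} \subseteq I_{\widetilde{\Sigma}}$.

Note that $\lesssim$ is only a preorder, not a partial order, in general. In fact $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$ means that the maximal
elements of $\widetilde{\Sigma}$ are included in those of $\widetilde{\Gamma}$. Let us denote by $\sim$ the relation $\lesssim \cap \gtrsim$.
Then, $\widetilde{\Sigma} \sim \widetilde{\Gamma}$ just means that $\widetilde{\Sigma}$ and $\widetilde{\Gamma}$ have the same maximal elements; in
general, it does not guarantee that they are identical. However, when restricted
to distributions "without redundancies", $\lesssim$ becomes a partial order.

Definition 4.5. A distribution $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ of $\Sigma$ is said to be simple if
for each $i, j \in [1..k]$, $i \ne j$ implies that $\Sigma_i \not\subseteq \Sigma_j$.

Proposition 4.6. Let $\widetilde{\Sigma}$ and $\widetilde{\Gamma}$ be simple distributions of $\Sigma$. If $\widetilde{\Sigma} \sim \widetilde{\Gamma}$ then
$\widetilde{\Sigma} = \widetilde{\Gamma}$.

For any independence relation $I$ over $\Sigma$, the associated standard distribution $\widetilde{\Sigma}_I$
is a simple distribution, and is the coarsest distribution inducing $I$. At the other
end of the spectrum, we can define the finest distribution inducing $I$ as follows:

Definition 4.7. Let $I$ be an independence relation over $\Sigma$, and $D = (\Sigma \times \Sigma) - I$.
The distribution $\widetilde{\Delta}_I$ over $\Sigma$ is defined by:

$\widetilde{\Delta}_I = \{\{x, y\} \mid (x, y) \in D, x \ne y\} \cup \{\{x\} \mid x\ I\ y \text{ for each } y \ne x\}$

Proposition 4.8. Let $I$ be an independence relation over $\Sigma$. Then the
distribution $\widetilde{\Delta}_I$ is the finest simple distribution over $\Sigma$ that induces $I$.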
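Definition 4.7 is easy to compute directly. In the sketch below the function names `finest_distribution` and `induced_independence` are hypothetical; the first builds the two-element and one-element components of the definition, and the second is used only to confirm that the result induces the given independence relation.

```python
def finest_distribution(sigma, indep):
    """Components of Definition 4.7: dependent pairs, plus fully independent singletons."""
    letters = sorted(sigma)
    pairs = [{x, y} for x in letters for y in letters
             if x < y and (x, y) not in indep]
    singles = [{x} for x in letters
               if all((x, y) in indep for y in letters if y != x)]
    return pairs + singles


def induced_independence(dist):
    """Pairs (a, b) of actions that share no component of the distribution."""
    sigma = set().union(*dist)
    return {(a, b) for a in sigma for b in sigma
            if not any(a in comp and b in comp for comp in dist)}


I = {("a", "b"), ("b", "a")}
finest = finest_distribution({"a", "b", "c", "d"}, I)
induces_I = induced_independence(finest) == I
```

For the alphabet and independence relation of Example 4.2, the result coincides with the five-component distribution listed there.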

A finer distribution can be faithfully implemented by a coarser distribution. 

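Lemma 4.9. Let $\widetilde{\Sigma} = (\Sigma_1, \ldots, \Sigma_k)$ and $\widetilde{\Gamma} = (\Gamma_1, \ldots, \Gamma_\ell)$ be distributions of $\Sigma$
such that $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$. Then, for each product transition system $\parallel_{i \in [1..k]} TS_i$ over $\widetilde{\Sigma}$,
there exists an isomorphic product transition system $\parallel_{j \in [1..\ell]} \widehat{TS}_j$ over $\widetilde{\Gamma}$.

Proof. For each $i \in [1..k]$ let $f(i)$ denote the least index $j$ in $[1..\ell]$ such that
$\Sigma_i \subseteq \Gamma_j$. For each $j \in [1..\ell]$, define $\widehat{TS}_j = (\widehat{Q}_j, \Rightarrow_j, \widehat{q}^{\,j}_{in})$ as follows.

- If $j$ is not in the range of $f$, then $\widehat{Q}_j = \{\widehat{q}^{\,j}_{in}\}$ and $\widehat{q}^{\,j}_{in} \stackrel{a}{\Rightarrow}_j \widehat{q}^{\,j}_{in}$ for each
  $a \in \Gamma_j$.

- If $j$ is in the range of $f$, let $f^{-1}(j) = \{i_1, \ldots, i_m\}$. Set $\widehat{TS}_j = (TS_{i_1} \parallel
  \cdots \parallel TS_{i_m})$.

It is then straightforward to verify that $\parallel_{i \in [1..k]} TS_i$ is isomorphic to
$\parallel_{j \in [1..\ell]} \widehat{TS}_j$. We omit the details.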







Corollary 4.10. Let $TS = (Q, \rightarrow, q_{in})$ be a transition system over $\Sigma$, and $\widetilde{\Sigma}$
and $\widetilde{\Gamma}$ be two distributions of $\Sigma$ such that $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$. If $TS$ is $\widetilde{\Sigma}$-implementable it
is also $\widetilde{\Gamma}$-implementable.

Let $I$ be an independence relation over $\Sigma$. We have already observed that $\widetilde{\Sigma}_I$ is
the coarsest distribution of $\Sigma$ whose induced independence relation is $I$. Coupling
this remark with the preceding corollary, we can now settle the synthesis problem
with respect to independence relations.

Corollary 4.11. Let $TS = (Q, \rightarrow, q_{in})$ be a transition system over $\Sigma$, and $\widetilde{\Sigma}$
be a distribution of $\Sigma$ inducing the independence relation $I$. Then if $TS$ is
$\widetilde{\Sigma}$-implementable it is also $\widetilde{\Sigma}_I$-implementable. Moreover $TS$ is $I$-implementable if
and only if it is $\widetilde{\Sigma}_I$-implementable.

We remark that the converse of Lemma 4.9 is not true: if $\widetilde{\Sigma} \lesssim \widetilde{\Gamma}$, it may be
the case that $TS$ is $\widetilde{\Gamma}$-implementable but not $\widetilde{\Sigma}$-implementable. Details can be
found in the full paper [CMT99].



5 Canonical Implementations and Determinacy 

A system may have more than one $\widetilde\Sigma$-implementation for a fixed distribution $\widetilde\Sigma$.
For instance, the system 




has two implementations with respect to the distributed alphabet $\widetilde\Sigma = (\{a, c, d\}, \{b, c, d\})$,
namely $ca + c(a + d) \parallel c(b + d)$ and $c(a + d) \parallel cb + c(b + d)$.

One question that naturally arises is whether there exists a unique minimal
or maximal family of equivalence relations $\{\equiv_1, \ldots, \equiv_k\}$ on states that makes
Theorem 3.1 go through. We say that a family $\{\equiv_1, \ldots, \equiv_k\}$ is minimal (respectively,
maximal) if there is no other family $\{\equiv'_1, \ldots, \equiv'_k\}$ with respect to which
$TS$ is $\widetilde\Sigma$-implementable with $\equiv'_i \subseteq \equiv_i$ (respectively, $\equiv_i \subseteq \equiv'_i$) for each $i \in [1..k]$.

It turns out that for deterministic systems, we can find unique minimal
implementations. A transition system $TS = (Q, \rightarrow, q_{in})$ is deterministic if $q \xrightarrow{a} q'$
and $q \xrightarrow{a} q''$ imply that $q' = q''$.
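
A minimal sketch of this determinacy condition, assuming a transition relation encoded as a set of triples `(source, label, target)` (an encoding chosen here for illustration, not taken from the paper):

```python
def is_deterministic(transitions):
    """True iff no state has two distinct successors under the same label."""
    successor = {}
    for source, label, target in transitions:
        # setdefault records the first target seen for (source, label).
        if successor.setdefault((source, label), target) != target:
            return False
    return True
```

The check is linear in the number of transitions, since each triple is inspected once.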

Theorem 5.1. Suppose $TS = (Q, \rightarrow, q_{in})$ is deterministic and $\widetilde\Sigma$-implementable.
Then there exists a unique minimal family $\{\equiv_1, \ldots, \equiv_k\}$ of equivalence relations
on states with respect to which $TS$ is $\widetilde\Sigma$-implementable.






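Proof. Suppose that $\{\equiv_1, \ldots, \equiv_k\}$ and $\{\equiv'_1, \ldots, \equiv'_k\}$ are two families which
represent $\widetilde\Sigma$-implementations of $TS$. Let $\{\widehat\equiv_1, \ldots, \widehat\equiv_k\}$ be the intersection family,
given by $q \mathrel{\widehat\equiv_i} q' \iff q \equiv_i q' \wedge q \equiv'_i q'$.

By definition, $\widehat\equiv_i \subseteq \equiv_i$ and $\widehat\equiv_i \subseteq \equiv'_i$. Thus, it suffices to show that
$\{\widehat\equiv_1, \ldots, \widehat\equiv_k\}$ represents a $\widetilde\Sigma$-implementation of $TS$. From the definition of the
relations $\{\widehat\equiv_1, \ldots, \widehat\equiv_k\}$, it is obvious that both conditions (i) and (ii) of Theorem 3.1 are
satisfied. Now suppose that $q \in Q$ and $a \in \Sigma$. For every $i \in loc(a)$, let $s_i, s'_i \in Q$
be such that $s_i \mathrel{\widehat\equiv_i} q$ and $s_i \xrightarrow{a} s'_i$. This means that for every $i \in loc(a)$,
both $s_i \equiv_i q$ and $s_i \equiv'_i q$. Hence by condition (iii) there exists $q'$ such that
$q \xrightarrow{a} q'$ and $q' \equiv_i s'_i$ for every $i \in loc(a)$. Similarly, there exists $q''$ such that
$q \xrightarrow{a} q''$ and $q'' \equiv'_i s'_i$ for every $i \in loc(a)$. Since $TS$ is deterministic, it must be
that $q' = q''$. Thus, $q' = q''$ is such that $q' \mathrel{\widehat\equiv_i} s'_i$ for each $i \in loc(a)$.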



This result leads us to conjecture that the synthesis problem for deterministic 
systems is much less expensive computationally than the synthesis problem in the 
general case, since it suffices to look for the unique minimal family of equivalence 
relations which describe the implementation. 

We conclude this section by observing that
a deterministic system may have more than one
maximal family $\{\equiv_1, \ldots, \equiv_k\}$ for which it is $\widetilde\Sigma$-implementable.
For instance the system on the
left has two distinct maximal implementations with
respect to the distribution $(\{a, c\}, \{b, c\})$, namely
$(\mathrm{fix}\ X.\ a + cX) \parallel b + cb$ and $a + ca \parallel (\mathrm{fix}\ Y.\ b + cY)$,
whose components are not even language equivalent.




6 Synthesis Modulo Bisimulation 

In the course of specifying a system, we may accidentally destroy its inherent 
product structure. This may happen, for example, if we optimize the design 
and eliminate redundant states. In such situations, we would like to be able to 
reconstruct a product transition system from the reduced specification. Since the 
synthesized system will not, in general, be isomorphic to the specification, we 
need a criterion for ensuring that the two systems are behaviourally equivalent. 
We use strong bisimulation [Mil89] for this purpose. 

In general, synthesizing a behaviourally equivalent product implementation 
from a reduced specification appears to be a hard problem. In this section, we 
show how to solve the problem for reduced specifications which can be imple- 
mented as deterministic product transition systems — that is, the global transi- 
tion system generated by the implementation is deterministic. Notice that the 
specification itself may be nondeterministic. Since many distributed systems im- 
plemented in hardware, such as digital controllers, are actually deterministic, 
our characterization yields a synthesis result for a large class of useful systems. 

We begin by recalling the definition of bisimulation. 






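Definition 6.1. A bisimulation between a pair of transition systems $TS_1 = (Q_1, \rightarrow_1, q^1_{in})$
and $TS_2 = (Q_2, \rightarrow_2, q^2_{in})$ is a relation $R \subseteq (Q_1 \times Q_2)$ such that:

– If $(q_1, q_2) \in R$ and $q_1 \xrightarrow{a}_1 q'_1$, there exists $q'_2$, $q_2 \xrightarrow{a}_2 q'_2$ and $(q'_1, q'_2) \in R$.
– If $(q_1, q_2) \in R$ and $q_2 \xrightarrow{a}_2 q'_2$, there exists $q'_1$, $q_1 \xrightarrow{a}_1 q'_1$ and $(q'_1, q'_2) \in R$.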

The synthesis problem modulo bisimilarity 

The synthesis problem modulo bisimilarity can now be formulated as follows.
If $TS = (Q, \rightarrow, q_{in})$ is a transition system over $\Sigma$, and $\widetilde\Sigma = (\Sigma_1, \ldots, \Sigma_k)$ is a
distribution of $\Sigma$, does there exist a product system $\|_{i \in [1..k]} TS_i$ over $\widetilde\Sigma$ such
that $\|_{i \in [1..k]} TS_i$ is bisimilar to $TS$?

To settle this question for deterministic implementations, we need to consider 
product languages. 

Languages Let $TS = (Q, \rightarrow, q_{in})$ be a transition system over $\Sigma$. The language
of $TS$ is the set $L(TS) \subseteq \Sigma^*$ consisting of the labels along all runs of $TS$. In
other words, $L(TS) = \{w \mid q_{in} \xrightarrow{w} q,\ q \in Q\}$.

Notice that $L(TS)$ is always prefix-closed and always contains the empty
word. Moreover, $L(TS)$ is regular whenever $TS$ is finite. For the rest of this
section, we assume all transition systems which we encounter are finite.
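
To make the definition concrete, the following sketch enumerates the words of $L(TS)$ up to a given length. It assumes single-character labels and the triple encoding `(source, label, target)` of transitions; both are conventions of this example only.

```python
def language_up_to(transitions, initial, max_len):
    """All words of length <= max_len that label a run starting in `initial`."""
    frontier = {("", initial)}      # pairs (word read so far, state reached)
    words = {""}                    # the empty run contributes the empty word
    for _ in range(max_len):
        frontier = {(w + a, t) for w, q in frontier
                    for s, a, t in transitions if s == q}
        words |= {w for w, _ in frontier}
    return words
```

Every word is collected together with all of its prefixes, which reflects the prefix-closure noted above.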

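Product languages Let $L \subseteq \Sigma^*$ and let $\widetilde\Sigma = (\Sigma_1, \ldots, \Sigma_k)$ be a distribution
of $\Sigma$. For $w \in \Sigma^*$, let $w\!\upharpoonright_{\Sigma_i}$ denote the projection of $w$ onto $\Sigma_i$, obtained by
erasing all letters in $w$ which do not belong to $\Sigma_i$.

The language $L$ is a product language over $\widetilde\Sigma$ if for each $i \in [1..k]$ there is a
language $L_i \subseteq \Sigma_i^*$ such that $L = \{w \mid w\!\upharpoonright_{\Sigma_i} \in L_i,\ i \in [1..k]\}$.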

For deterministic transition systems, bisimilarity coincides with language 
equivalence. We next show that we can extend this result to get a simple char- 
acterization of transition systems which are bisimilar to deterministic product 
transition systems. We first recall a basic definition. 

Bisimulation quotient Let $TS = (Q, \rightarrow, q_{in})$ be a transition system and let
$\sim_{TS}$ be the largest bisimulation relation between $TS$ and itself. The relation
$\sim_{TS}$ defines an equivalence relation over $Q$. For $q \in Q$, let $[q]$ denote the
$\sim_{TS}$-equivalence class containing $q$. The bisimulation quotient of $TS$ is the transition
system $TS/{\sim_{TS}} = (\widehat{Q}, \rightsquigarrow, [q_{in}])$ where

– $\widehat{Q} = \{[q] \mid q \in Q\}$.

– $[q] \stackrel{a}{\rightsquigarrow} [q']$ if there exist $q_1 \in [q]$ and $q'_1 \in [q']$ such that $q_1 \xrightarrow{a} q'_1$.
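
For finite systems the quotient can be computed by partition refinement. The sketch below uses a naive refinement loop rather than an optimized algorithm; the function names and the triple encoding of transitions are assumptions of this example.

```python
def bisimulation_classes(states, transitions):
    """Map each state to the index of its class under the largest bisimulation."""
    block_of = {q: 0 for q in states}            # start with a single block
    while True:
        # Signature of q: the (label, target block) pairs that q can reach.
        signature = {
            q: frozenset((a, block_of[t]) for s, a, t in transitions if s == q)
            for q in states
        }
        ids, refined = {}, {}
        for q in states:
            # States stay together only if old block and signature both agree.
            refined[q] = ids.setdefault((block_of[q], signature[q]), len(ids))
        if len(ids) == len(set(block_of.values())):
            return refined                       # no block was split: stable
        block_of = refined

def quotient(states, transitions, initial):
    """Return (classes, quotient transitions, initial class)."""
    block_of = bisimulation_classes(states, transitions)
    q_transitions = {(block_of[s], a, block_of[t]) for s, a, t in transitions}
    return set(block_of.values()), q_transitions, block_of[initial]
```

For example, the nondeterministic system with transitions 0 -a-> 1, 0 -a-> 2, 1 -b-> 3 and 2 -b-> 3 collapses states 1 and 2 into one class, so its quotient has three states and is deterministic even though the system itself is not.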

The main result of this section is the following. 

Theorem 6.2. Let $TS$ be a transition system over $\Sigma$ and let $\widetilde\Sigma$ be a distribution
of $\Sigma$. The system $TS$ is bisimilar to a deterministic product transition system
over $\widetilde\Sigma$ iff $TS$ satisfies the following two conditions.







– The bisimulation quotient $TS/{\sim_{TS}}$ is deterministic.

– The language $L(TS)$ is a product language over $\widetilde\Sigma$.

To prove this theorem, we first recall the following basic connection between 
product languages and product systems [Thi95]. 

Lemma 6.3. $L(\|_{i \in [1..k]} TS_i) = \{w \mid w\!\upharpoonright_{\Sigma_i} \in L(TS_i),\ i \in [1..k]\}$.

We also need the useful fact that a product language is always the product 
of its projections [Thi95]. 

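Lemma 6.4. Let $L \subseteq \Sigma^*$ and let $\widetilde\Sigma = (\Sigma_1, \ldots, \Sigma_k)$ be a distribution of $\Sigma$.
For $i \in [1..k]$, let $L_i = \{w\!\upharpoonright_{\Sigma_i} \mid w \in L\}$. Then, $L$ is a product language iff
$L = \{w \mid w\!\upharpoonright_{\Sigma_i} \in L_i,\ i \in [1..k]\}$.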

Notice that Lemma 6.4 yields an effective procedure for checking if a finite-state
transition system accepts a product language over a distribution
$(\Sigma_1, \ldots, \Sigma_k)$. For $i \in [1..k]$, construct the finite-state system $TS_i$ such that
$L(TS_i) = L(TS)\!\upharpoonright_{\Sigma_i}$ and then verify that $L(TS) = L(\|_{i \in [1..k]} TS_i)$.
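
The criterion of Lemma 6.4 can be tried out on finite languages with a brute-force sketch. This is not the automata-based procedure described above: it simply enumerates candidate words, which is feasible only for tiny examples. Words are Python strings over single-character letters, and the function names are hypothetical.

```python
from itertools import product

def project(word, component):
    """Erase all letters of `word` that do not belong to `component`."""
    return "".join(ch for ch in word if ch in component)

def is_product_language(language, distribution):
    """Check L = {w | every projection of w lies in the matching projection of L}."""
    language = set(language)
    projections = [{project(w, comp) for w in language} for comp in distribution]
    alphabet = sorted(set().union(*distribution))
    # Each letter lies in some component, so |w| is at most the sum of the
    # lengths of its projections; this bounds the words worth enumerating.
    max_len = sum(max((len(p) for p in proj), default=0) for proj in projections)
    candidates = set()
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if all(project(w, comp) in proj
                   for comp, proj in zip(distribution, projections)):
                candidates.add(w)
    return candidates == language
```

Over the distribution ({a, c}, {b, c}), the language {ε, a, b, ab, ba} passes the check, whereas {ε, a, ab} fails it: its projections also admit the words b and ba, which the language itself does not contain.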

Next, we state without proof some elementary facts about bisimulations. 

Lemma 6.5. (i) Let $TS$ be a deterministic transition system. Then, $TS/{\sim_{TS}}$
is also deterministic.

(ii) Let $TS_1$ and $TS_2$ be deterministic transition systems over $\Sigma$. If $L(TS_1) =
L(TS_2)$ then $TS_1$ is bisimilar to $TS_2$.

(iii) Let $TS_1$ and $TS_2$ be bisimilar transition systems. Then $TS_1/{\sim_{TS_1}}$ and
$TS_2/{\sim_{TS_2}}$ are isomorphic. Further $L(TS_1) = L(TS_2)$.

We now prove both parts of Theorem 6.2. 

Lemma 6.6. Suppose that a transition system $TS$ is bisimilar to a deterministic
product transition system. Then, $TS/{\sim_{TS}}$ is deterministic and $L(TS)$ is a
product language.

Proof. Let $\widehat{TS} = \|_{i \in [1..k]} \widehat{TS}_i$ be a deterministic product transition system such
that $TS$ is bisimilar to $\widehat{TS}$.

By Lemma 6.5 (iii), $L(TS) = L(\widehat{TS})$. Since $L(\widehat{TS})$ is a product language, it
follows that $L(TS)$ is a product language.

To check that $TS/{\sim_{TS}}$ is deterministic, we first observe that $\widehat{TS}/{\sim_{\widehat{TS}}}$ is
deterministic, by Lemma 6.5 (i). By part (iii) of the same lemma, $TS/{\sim_{TS}}$ must
be isomorphic to $\widehat{TS}/{\sim_{\widehat{TS}}}$. Hence, $TS/{\sim_{TS}}$ is also deterministic.



Lemma 6.7. Let $TS$ be a transition system over $\Sigma$ and $\widetilde\Sigma$ be a distribution of
$\Sigma$, such that $TS/{\sim_{TS}}$ is deterministic and $L(TS)$ is a product language over $\widetilde\Sigma$.
Then, $TS$ is bisimilar to a deterministic product transition system over $\widetilde\Sigma$.






Proof. Let $\widetilde\Sigma = (\Sigma_1, \ldots, \Sigma_k)$. For $i \in [1..k]$, let $L_i = \{w\!\upharpoonright_{\Sigma_i} \mid w \in L(TS)\}$.

We know that each $L_i$ is a regular prefix-closed language which contains the
empty word. Thus, we can construct the minimal deterministic finite-state
automaton $A_i = (Q_i, \rightarrow_i, q^i_{in}, F_i)$ recognizing $L_i$. Since $L_i$ contains the empty word,
$q^i_{in} \in F_i$. Consider the restricted transition relation $\rightarrow'_i\ = \ \rightarrow_i \cap\ (F_i \times \Sigma_i \times F_i)$. It
is easy to verify that the transition system $TS_i = (F_i, \rightarrow'_i, q^i_{in})$ is a deterministic
transition system such that $L(TS_i) = L_i$.

Consider the product $\widehat{TS} = \|_{i \in [1..k]} TS_i$. Lemma 6.3 tells us that $L(\widehat{TS}) =
\{w \mid w\!\upharpoonright_{\Sigma_i} \in L_i,\ i \in [1..k]\}$. From Lemma 6.4, it follows that $L(\widehat{TS}) = L(TS)$.

We claim that $TS$ is bisimilar to $\widehat{TS}$. Consider the quotient $TS/{\sim_{TS}}$. Both
$TS/{\sim_{TS}}$ and $\widehat{TS}$ are deterministic and $L(TS/{\sim_{TS}}) = L(TS) = L(\widehat{TS})$. Thus,
by Lemma 6.5 (ii), it must be the case that $TS/{\sim_{TS}}$ is bisimilar to $\widehat{TS}$. By
transitivity, $TS$ is also bisimilar to $\widehat{TS}$.

Using standard automata theory, we can derive the following result from 
Theorem 6.2. 

Corollary 6.8. Given a finite-state transition system $TS = (Q, \rightarrow, q_{in})$ and a
distributed alphabet $\widetilde\Sigma$, we can effectively decide whether $TS$ is bisimilar to a
deterministic product system over $\widetilde\Sigma$.

The synthesis problem modulo bisimilarity appears to be quite a bit more 
difficult when the product implementation is permitted to be nondeterminis- 
tic. We have a characterization of the class of systems which are bisimilar to 
nondeterministic product systems [CMT99]. Our characterization is phrased in 
terms of the structure of the execution tree obtained by unfolding the specifica- 
tion. Unfortunately, the execution tree may be infinite, so this characterization 
is not effective, even if the initial specification is finite-state. The difficulty lies 
in bounding the number of transitions in the product implementation which can 
collapse, via the bisimulation, to a single transition in the specification. 



References 

Arn94. A. Arnold: Finite transition systems and semantics of communicating
systems, Prentice-Hall (1994).

BD98. E. Badouel and Ph. Darondeau: Theory of Regions. Lectures on Petri nets
I (Basic Models), LNCS 1491 (1998) 529-588.

CMT99. I. Castellani, M. Mukund and P.S. Thiagarajan: Characterizing decomposable
transition systems. Internal Report, Chennai Mathematical Institute
(1999).

ER90. A. Ehrenfeucht and G. Rozenberg: Partial 2-structures; Part II, State spaces
of concurrent systems, Acta Inf. 27 (1990) 348-368.

GM92. J.F. Groote and F. Moller: Verification of Parallel Systems via Decomposition,
Proc. CONCUR’92, LNCS 630 (1992) 62-76.

Hol97. G.J. Holzmann: The model checker SPIN, IEEE Trans. on Software
Engineering, 23, 5 (1997) 279-295.

Kur94. R.P. Kurshan: Computer-Aided Verification of Coordinating Processes: The
Automata-Theoretic Approach, Princeton University Press (1994).

Maz89. A. Mazurkiewicz: Basic notions of trace theory, in: J.W. de Bakker,
W.-P. de Roever, G. Rozenberg (eds.), Linear time, branching time and partial
order in logics and models for concurrency, LNCS 354 (1989) 285-363.

Mil89. R. Milner: Communication and Concurrency, Prentice-Hall, London (1989).

Mol89. F. Moller: Axioms for Concurrency, Ph.D. Thesis, University of Edinburgh
(1989).

MM93. F. Moller and R. Milner: Unique decomposition of processes, Theor.
Comput. Sci. 107 (1993) 357-363.

Mor98. R. Morin: Decompositions of asynchronous systems, Proc. CONCUR’98,
LNCS 1466 (1998) 549-564.

Muk92. M. Mukund: Petri Nets and Step Transition Systems, Int. J. Found.
Comput. Sci. 3, 4 (1992) 443-478.

NRT92. M. Nielsen, G. Rozenberg and P.S. Thiagarajan: Elementary transition
systems, Theor. Comput. Sci. 96 (1992) 3-33.

Thi95. P.S. Thiagarajan: A trace consistent subset of PTL, Proc. CONCUR’95,
LNCS 962 (1995) 438-452.

WN95. G. Winskel and M. Nielsen: Models for concurrency, in S. Abramsky,
D. Gabbay and T.S.E. Maibaum, eds, Handbook of Logic in Computer Science,
Vol 4, Oxford (1995) 1-148.



Beyond Region Graphs: 

Symbolic Forward Analysis of Timed Automata 



Supratik Mukhopadhyay and Andreas Podelski 



Max-Planck-Institut für Informatik
Im Stadtwald, 66123 Saarbrücken, Germany
{supratik,podelski}@mpi-sb.mpg.de



Abstract. Theoretical investigations of infinite-state systems have so 
far concentrated on decidability results; in the case of timed automata 
these results are based on region graphs. We investigate the specific 
procedure that is used practically in order to decide verification prob- 
lems, namely symbolic forward analysis. This procedure is possibly non- 
terminating. We present basic concepts and properties that are useful for 
reasoning about sufficient termination conditions, and then derive some 
conditions. The central notions here are constraint transformers asso- 
ciated with sequences of automaton edges and zone trees labeled with 
successor constraints. 



1 Introduction 

A timed automaton [1] models a system whose transitions between finitely many
control locations depend on the values of clocks. The clocks advance continuously 
over time; they can individually be reset to the value 0. Since the clocks take 
values over reals, the state space of a timed automaton is infinite. 

The theoretical and the practical investigations on timed automata are re- 
cent but already quite extensive (see e.g. [1,7,11,2,4]). Many decidability results 
are obtained by designing algorithms on the region graph, which is a finite quo- 
tient of the infinite state transition graph [1]. Practical experiments showing the 
feasibility of model checking for timed automata, however, employ symbolic for- 
ward analysis. We do not know of any practical tool that constructs the region 
graph. Instead, symbolic model checking is extended directly from the finite to 
the infinite case; logical formulas over reals are used to ‘symbolically’ represent 
infinite sets of tuples of clock values and are manipulated by applying the same 
logical operations that are applied to Boolean formulas in the finite state case. 

If model checking is based on backward analysis (where one iteratively com- 
putes sets of predecessor states), termination is guaranteed [9]. In comparison, 
symbolic forward analysis for timed automata has the theoretical disadvantage 
of possible non-termination. Practically, however, it has the advantage that it 
is amenable to on-the-fly local model checking and to partial-order reduction
techniques (see [8] for a discussion of forward vs. backward analysis). 

In symbolic forward analysis applied to the timed automata arising in practical
applications (see e.g. [11]), the theoretical possibility of non-termination does
not seem to play a role. Existing versions that exclude this possibility (through
built-in runtime checks [4] or through a static preprocessing step [7]) are not
used in practice.

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 232-244, 1999.
© Springer-Verlag Berlin Heidelberg 1999

This situation leads us to raise the question whether there exist ‘interesting’
sufficient conditions for the termination of symbolic model checking procedures 
for timed automata based on forward analysis. Here, ‘interesting’ means applica- 
ble to a large class of cases in practical applications. The existence of a practically 
relevant class of infinite-state systems for which the practically employed pro- 
cedure is actually an algorithm would be a theoretically satisfying explanation 
of the success of the ongoing practice of using this procedure, and it may guide 
us in designing practically successful verification procedures for other classes of 
infinite-state systems. 

As a first step towards answering the question that we are raising, we build 
a kind of ‘toolbox’ consisting of basic concepts and properties that are useful 
for reasoning about sufficient termination conditions. The central notions here 
are constraint transformers associated with sequences of automaton edges and 
zone trees labeled with successor constraints. The constraint transformer
associated with the sequence of edges $e_1, \ldots, e_n$ of the timed automaton assigns
to a constraint $\varphi$ another constraint that ‘symbolically’ represents the set of the
successor states along the edges $e_1, \ldots, e_n$ of the states in the set represented
by $\varphi$. We prove properties for constraint transformers associated with edge
sequences of a certain form; these properties are useful in termination proofs as we
then show. The zone tree is a vehicle that can be used to investigate sufficient 
conditions for termination without having to go into the algorithmic details of 
symbolic forward analysis procedures. It captures the fact that the constraints 
enumerated in a symbolic forward analysis must respect a certain tree order. 

We show how the zone tree can characterize termination of (various versions 
of) symbolic forward analysis. A combinatorial reasoning is then used to derive 
sufficient termination conditions for symbolic forward analysis. We prove that 
symbolic forward analysis terminates for three classes of timed automata. These 
classes are not relevant practically; the goal is merely to demonstrate how the 
presented concepts and properties of the successor constraint function and of 
the zone tree can be employed to prove termination. Termination proofs can be 
quite tedious, as the third case shows; the proof here distinguishes many cases. 

2 The Constraint Transformer $\varphi \mapsto [\![w]\!](\varphi)$

A timed automaton $\mathcal{U}$ can, for the purpose of reachability analysis, be defined
as a set $\mathcal{E}$ of guarded commands $e$ (called edges) of the form below. Here $L$ is
a variable ranging over the finite set of locations, and $x = (x_1, \ldots, x_n)$ are the
variables standing for the clocks and ranging over nonnegative real numbers. As
usual, the primed version of a variable stands for its value after the transition.
The ‘time delay’ variable $z$ ranges over nonnegative real numbers.

$e \equiv L = \ell \wedge \gamma_e(x) \ \ [\,]\ \ L' = \ell' \wedge \alpha_e(x, x', z).$






The guard formula $\gamma_e(x)$ over the variables $x$ is built up from conjuncts of
the form $x_i \sim k$ where $x_i$ is a clock variable, $\sim$ is a comparison operator (i.e.,
$\sim\ \in \{=, <, \le, >, \ge\}$) and $k$ is a natural number.

The action formula $\alpha_e(x, x', z)$ of $e$ is defined by a subset $\mathit{Reset}_e$ of $\{1, \ldots, n\}$
(denoting the clocks that are reset); it is of the form

$\alpha_e(x, x', z) \equiv \bigwedge_{i \in \mathit{Reset}_e} x'_i = z \ \wedge \bigwedge_{i \notin \mathit{Reset}_e} x'_i = x_i + z.$

We write $\psi_e$ for the logical formula corresponding to $e$ (with the free variables $x$
and $x'$; we replace the guard symbol $[\,]$ with conjunction).

$\psi_e(x, x') \equiv L = \ell \wedge \gamma_e(x) \wedge L' = \ell' \wedge \exists z\, \alpha_e(x, x', z)$

The states of $\mathcal{U}$ (called positions) are tuples of the form $(\ell, v)$ consisting of values
for the location and for each clock. The position $(\ell, v)$ can make a time transition
to any position $(\ell, v + \delta)$ where $\delta \ge 0$ is a real number.

The position $(\ell, v)$ can make an edge transition (followed by a time transition)
to the position $(\ell', v')$ using the edge $e$ if the values $\ell$ for $L$, $v$ for $x$, $\ell'$ for $L'$
and $v'$ for $x'$ define a solution for $\psi_e$. (An edge transition by itself is defined if
we replace the variable $z$ in the formula for $\alpha_e$ by the constant 0.)
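
The two kinds of transitions can be sketched on concrete positions as follows. The `Edge` record, the 0-based clock indices and the textual operator names are conventions invented for this illustration.

```python
from dataclasses import dataclass
import operator

OPS = {"=": operator.eq, "<": operator.lt, "<=": operator.le,
       ">": operator.gt, ">=": operator.ge}

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    guard: tuple = ()                 # conjuncts (clock index, operator, constant k)
    resets: frozenset = frozenset()   # indices of the clocks that are reset

def time_step(position, delta):
    """Time transition: every clock advances by delta >= 0."""
    location, clocks = position
    assert delta >= 0
    return location, tuple(v + delta for v in clocks)

def edge_step(position, edge, z=0.0):
    """Edge transition followed by a delay z; returns None if the edge is disabled."""
    location, clocks = position
    if location != edge.source:
        return None
    if not all(OPS[op](clocks[i], k) for i, op, k in edge.guard):
        return None
    # Reset clocks take the value z, all other clocks advance by z.
    new_clocks = tuple(z if i in edge.resets else v + z
                       for i, v in enumerate(clocks))
    return edge.target, new_clocks
```

Calling `edge_step` with the default `z = 0` gives the edge transition by itself, as in the parenthetical remark above.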

We use constraints $\varphi$ in order to represent certain sets of positions (called
zones). A constraint is a conjunction of the equality $L = \ell$ with a conjunction of
formulas of the form $x_i - x_j \sim c$ or $x_i \sim c$ where $c$ is an integer (i.e. with a zone
constraint as used in [4]). We identify solutions of constraints with positions
$(\ell, v)$ of the timed automaton.

We single out the initial constraint $\varphi^0$ that denotes the time successors of
the initial position $(\ell^0, 0)$.

$\varphi^0 \equiv L = \ell^0 \wedge x_1 \ge 0,\ x_2 = x_1,\ \ldots,\ x_n = x_1$

A constraint $\varphi$ is called time-closed if its set of solutions is closed under time
transitions. Formally, $\varphi(x)$ is equivalent to $(\exists x\, \exists z\, (\varphi \wedge x'_1 = x_1 + z \wedge \ldots \wedge x'_n =
x_n + z))[x/x']$. For example, the initial constraint is time-closed. In the following,
we will be interested only in time-closed constraints.

In the definition below, $\varphi'[x'/x]$ denotes the constraint obtained from $\varphi'$ by
$\alpha$-renaming (replace each $x'_i$ by $x_i$).

We write $e_1 \ldots e_m$ for the word $w$ obtained by concatenating the ‘letters’
$e_1, \ldots, e_m$; thus, $w$ is a word over the set of edges $\mathcal{E}$, i.e. $w \in \mathcal{E}^*$.

Definition 1 (Constraint Transformer $[\![w]\!]$). The constraint transformer
wrt. an edge $e$ is the ‘successor constraint function’ $[\![e]\!]$ that assigns a
constraint $\varphi$ the constraint

$[\![e]\!](\varphi) \equiv (\exists x\, (\varphi \wedge \psi_e))[x'/x].$

The successor constraint function $[\![w]\!]$ wrt. a string $w = e_1 \ldots e_m$ of length
$m \ge 0$ is the functional composition of the functions wrt. the edges $e_1, \ldots, e_m$,
i.e. $[\![w]\!] = [\![e_1]\!] \circ \ldots \circ [\![e_m]\!]$.






Thus, $[\![\varepsilon]\!](\varphi) = \varphi$ and $[\![w.e]\!](\varphi) = [\![e]\!]([\![w]\!](\varphi))$. The solutions of $[\![w]\!](\varphi)$ are
exactly the (“edge plus time”) successors of a solution of $\varphi$ by taking the sequence
of transitions via the edges $e_1, \ldots, e_m$ (in that order).
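
One common way to realize such a transformer on zones is a difference-bound matrix (DBM). The sketch below is only an illustration under simplifying assumptions and is not the quantifier-elimination procedure the text refers to: it handles non-strict bounds only, omits the location component, and uses index 0 for a reference clock that is constantly 0. Entry `d[i][j]` bounds $x_i - x_j$ from above; all names are hypothetical.

```python
import math

INF = math.inf

def canonical(dbm):
    """Tighten all bounds (Floyd-Warshall); return None if the zone is empty."""
    n = len(dbm)
    d = [row[:] for row in dbm]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    if any(d[i][i] < 0 for i in range(n)):
        return None
    return d

def initial_zone(num_clocks):
    """All clocks equal and >= 0: the time successors of the all-zero valuation."""
    n = num_clocks + 1
    d = [[0] * n for _ in range(n)]
    for i in range(1, n):
        d[i][0] = INF                      # no upper bound on any clock
    return d

def post(dbm, guard, resets):
    """Intersect with the guard, reset clocks, then let time pass."""
    d = [row[:] for row in dbm]
    for i, j, c in guard:                  # conjunct x_i - x_j <= c
        d[i][j] = min(d[i][j], c)
    d = canonical(d)
    if d is None:
        return None
    n = len(d)
    for r in resets:                       # x_r := 0
        for j in range(n):
            d[r][j] = d[0][j]
            d[j][r] = d[j][0]
        d[r][r] = 0
    for i in range(1, n):                  # time elapse: drop upper bounds
        d[i][0] = INF
    return d

def post_string(dbm, edges):
    """Apply the transformers of e_1, ..., e_m in that order."""
    for guard, resets in edges:
        if dbm is None:
            return None
        dbm = post(dbm, guard, resets)
    return dbm
```

With two clocks, the guard $x_1 \ge 1$ is written `(0, 1, -1)`; applying an edge with this guard and a reset of $x_2$ to the initial zone yields the zone $x_2 \ge 0 \wedge x_1 - x_2 \ge 1$.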

We will next consider constraint transformers $[\![w]\!]$ for strings $w$ of a certain
form. In the next definition, the terminology ‘a clock $x_i$ is queried in the edge $e$’
means that $x_i$ is a variable occurring in the guard formula $\gamma_e$ of $e$; ‘$x_i$ is reset
in $e$’ means that $i \in \mathit{Reset}_e$.

Definition 2 (Stratified Strings). A string $w = e_1 \ldots e_m$ of edges is called
stratified if

– each clock $x_1, \ldots, x_n$ is reset at least once in $w$, and

– if $x_i$ is reset in $e_j$ then $x_i$ is not queried in $e_1, \ldots, e_j$.
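
The two conditions translate directly into a check over a string of edges. In this sketch each edge is reduced to a pair `(queried, resets)` of sets of clock indices, with clocks numbered from 1; this encoding is an assumption of the example.

```python
def is_stratified(word, num_clocks):
    """word: list of edges, each given as (queried clocks, reset clocks)."""
    reset_somewhere = set()
    for j, (_, resets) in enumerate(word):
        reset_somewhere |= set(resets)
        for i in resets:
            # x_i must not be queried in e_1 .. e_j (the prefix ending at e_j).
            if any(i in queried for queried, _ in word[: j + 1]):
                return False
    # Every clock must be reset at least once in the string.
    return reset_somewhere == set(range(1, num_clocks + 1))
```

In a stratified string, every query of a clock therefore occurs strictly after the last reset of that clock.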

Proposition 1. The successor constraint function wrt. a stratified string $w$ is
a constant function over satisfiable constraints (i.e. there exists a unique
constraint $\varphi_w$ such that $[\![w]\!](\varphi) = \varphi_w$ for all satisfiable constraints $\varphi$).

Proof. We express the successor constraint of the constraint $\varphi$ wrt. the stratified
string $w = e_1 \ldots e_m$ equivalently by

$[\![w]\!](\varphi) \equiv (\exists x\, \exists x^1 \ldots \exists x^{m-1}\, \exists z^1 \ldots \exists z^m\, (\varphi \wedge \psi_1 \wedge \ldots \wedge \psi_m))[x/x^m]$

where $\psi_k$ is the formula that we obtain by applying $\alpha$-renaming to the
(quantifier-free) conjunction of the guard formula $\gamma_{e_k}(x)$ and the action formula
$\alpha_{e_k}(x, x', z)$ for the edge $e_k$, i.e.

$\psi_k \equiv \gamma_{e_k}(x^{k-1}) \wedge \alpha_{e_k}(x^{k-1}, x^k, z^k).$

Thus, in the formula for $e_k$, we rename the clock variable $x_i$ to $x_i^{k-1}$, its primed
version $x'_i$ to $x_i^k$, and the ‘time delay’ variable $z$ to $z^k$.

We identify the variables $x_i$ (appearing in $\varphi$) with their “0-th renaming” $x_i^0$
(appearing in $\psi_1$); accordingly we can write $x^0$ for the tuple of variables $x$.

We will transform $\exists x^1 \ldots \exists x^{m-1}(\psi_1 \wedge \ldots \wedge \psi_m)$ equivalently to a constraint $\psi$
containing only conjuncts of the form $x_i^m = z^l + \ldots + z^m$ and of the form
$z^l + \ldots + z^m \sim c$ where $l > 0$; i.e. $\psi$ does not contain any of the variables $x_i$ of $\varphi$.
Thus, we can move the quantifiers $\exists x$ inside; formally, $\exists x\, (\varphi \wedge \psi)$ is equivalent
to $(\exists x\, \varphi) \wedge \psi$. Since $\varphi$ is satisfiable, the conjunct $\exists x\, \varphi$ is equivalent to true.
Summarizing, $[\![w]\!](\varphi)$ is equivalent to a formula that does not depend on $\varphi$,
which is the statement to be shown.

The variable $x_i^k$ (the “$k$-th renaming of the $i$-th clock variable”) occurs in the
action formula of $\psi_k$, either in the form $x_i^k = z^k$ or in the form $x_i^k = x_i^{k-1} + z^k$,
and it occurs in the guard formula of $\psi_{k+1}$, in the form $x_i^k \sim c$.

If the $i$-th clock is not reset in the edges $e_1, \ldots, e_{k-1}$, then we replace the
conjunct $x_i^k = x_i^{k-1} + z^k$ by $x_i^k = x_i + z^1 + \ldots + z^k$.

Otherwise, let $l$ be the largest index of an edge $e_l$ with a reset of the $i$-th
clock. Then we replace $x_i^k = x_i^{k-1} + z^k$ by $x_i^k = z^l + \ldots + z^k$.







If $k = m$, the first case cannot arise due to the first condition on stratified
strings (the $i$-th clock must be reset at least once in the edges $e_1, \ldots, e_m$). That
is, we replace $x_i^m = x_i^{m-1} + z^m$ always by a conjunct of the form $x_i^m = z^l + \ldots + z^m$.

If the conjunct $x_i^k \sim c$ appears in $\psi_{k+1}$, then, by assumption on $w$ (the second
condition for stratified strings), the $i$-th clock is reset in an edge $e_l$ where $l \le k$.
Therefore, we can replace the conjunct $x_i^k \sim c$ by $z^l + \ldots + z^k \sim c$.

Now, each variable $x_i^k$ (for $0 < k < m$) has exactly one occurrence, namely
in a conjunct $C$ of the form $x_i^k = x_i + z^1 + \ldots + z^k$ or $x_i^k = z^l + \ldots + z^k$. Hence, the
quantifier $\exists x_i^k$ can be moved inside, before the conjunct $C$; the formula $\exists x_i^k\, C$
can be replaced by true.

After the above replacements, all conjuncts are of the form $x_i^m = z^l + \ldots + z^m$
or of the form $z^l + \ldots + z^m \sim c$; as explained above, this is sufficient to show
the statement. □

We say that an edge $e$ is reset-free if $\mathit{Reset}_e = \emptyset$, i.e., its action is of the form
$\alpha_e \equiv \bigwedge_{i = 1, \ldots, n} x'_i = x_i + z$. A string $w$ of edges is reset-free if all its edges are.

Proposition 2. If the string $w$ is reset-free, and the successor constraint of a
time-closed constraint of the form $L = \ell \wedge \varphi$ is of the form $L = \ell' \wedge \varphi'$, then $\varphi'$
entails $\varphi$, formally $\varphi' \models \varphi$.

Proof. It is sufficient to show the statement for $w$ consisting of only one reset-free
edge $e$. Since $\varphi$ is time-closed, it is equivalent to $(\exists x\, \exists z\, (\varphi \wedge x' = x + z))[x/x']$.

Then $[\![w]\!](L = \ell \wedge \varphi)$ is equivalent to $(\exists \ldots (L = \ell' \wedge \varphi \wedge x' = x + z' \wedge \gamma_e(x') \wedge
x'' = x' + z''))[x/x'']$. This constraint is equivalent to $L = \ell' \wedge \varphi(x) \wedge \gamma(x)$. This
shows the statement. □

3 Zone Trees and Symbolic Forward Analysis 

Definition 3 (Zone Tree). The zone tree of a timed automaton $\mathcal{U}$ is an infinite
tree with domain $\mathcal{E}^*$ (i.e., the nodes are the strings over $\mathcal{E}$) that labels the node $w$
by the constraint $[\![w]\!](\varphi^0)$.

That is, the root $\varepsilon$ is labeled by the initial constraint $\varphi^0$. For each node $w$
labeled $\varphi$, and for each edge $e \in \mathcal{E}$ of the timed automaton, the successor node $w.e$
is labeled by the constraint $[\![e]\!](\varphi)$. Clearly, the (infinite) disjunction of all
constraints labeling a node of the zone tree represents all reachable positions of $\mathcal{U}$.

We are interested in the termination of various versions of symbolic forward
analysis of a timed automaton $\mathcal{U}$. All versions have in common that they traverse
(a finite prefix of) its zone tree, in a particular order. The following definition of
a non-deterministic procedure abstracts away from that specific order.

Definition 4 (Symbolic Forward Analysis). A symbolic forward analysis
of a timed automaton $\mathcal{U}$ is a procedure that enumerates constraints $\varphi_i$ labeling
the nodes $w_i$ of the zone tree of $\mathcal{U}$ in a tree order such that the enumerated
constraints together represent all reachable positions. Formally,







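– $\varphi_i = [\![w_i]\!](\varphi^0)$ for $0 \le i < B$ where the bound $B$ is a natural number or $\omega$,

– if $w_i$ is a prefix of $w_j$ then $i \le j$,

– the disjunction $\bigvee_{0 \le i < B} \varphi_i$ is equivalent to the disjunction $\bigvee_{w \in \mathcal{E}^*} [\![w]\!](\varphi^0)$.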

We assume that the constraint $\varphi_i$ is computed by applying any of the known
quantifier elimination algorithms (see e.g. [12]) to a conjunction of constraints.

The number $i$ is a leaf of a symbolic forward analysis if the node $w_i$ is a leaf
of the tree formed by all the nodes $w_i$ where $0 \le i < B$.

We say that a symbolic forward analysis terminates if the bound $B$ is finite
(i.e. not $\omega$). We define that symbolic forward analysis terminates with local
subsumption if for all its leaves $i$ there exists $j < i$ such that the constraint $\varphi_i$ entails
the constraint $\varphi_j$. In contrast, it terminates with global subsumption if for all
its leaves $i$ the constraint $\varphi_i$ entails the disjunction of all constraints $\varphi_j$
where $j < i$. Model checking is more efficient with local subsumption than with
global subsumption, both practically and theoretically [5].
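
A breadth-first traversal with local subsumption can be sketched generically, with the successor function and the entailment test passed in as parameters. To keep the example self-contained, it is instantiated with a deliberately tiny constraint domain chosen for this illustration: one clock, and time-closed zones of the form "L = location and x ≥ lo", encoded as pairs `(location, lo)`. The edge encoding and the `max_nodes` budget are likewise assumptions of the sketch.

```python
from collections import deque

def forward_analysis(initial, edges, post, entails, max_nodes=1000):
    """Enumerate constraints of a finite prefix of the zone tree, breadth-first.

    A node becomes a leaf when its constraint entails an earlier one;
    unsatisfiable successors (None) are not enumerated at all.
    """
    enumerated = [initial]
    queue = deque([initial])
    while queue:
        phi = queue.popleft()
        for e in edges:
            succ = post(phi, e)
            if succ is None:
                continue
            if len(enumerated) >= max_nodes:
                raise RuntimeError("node budget exhausted; analysis may not terminate")
            subsumed = any(entails(succ, old) for old in enumerated)
            enumerated.append(succ)
            if not subsumed:               # only non-subsumed nodes are expanded
                queue.append(succ)
    return enumerated

def post(phi, edge):
    """Successor of the zone (location, lo) along (source, target, op, k, reset)."""
    location, lo = phi
    source, target, op, k, reset = edge
    if location != source:
        return None
    if op == "<=":
        if lo > k:                         # guard x <= k leaves the zone empty
            return None
    else:                                  # op == ">=": keep [max(lo, k), infinity)
        lo = max(lo, k)
    # A reset sets x to 0; afterwards time may pass, so zones stay upward closed.
    return (target, 0 if reset else lo)

def entails(phi, psi):
    """(l, lo) entails (l', lo') iff same location and x >= lo implies x >= lo'."""
    return phi[0] == psi[0] and phi[1] >= psi[1]
```

For the three edges A → B with guard x ≥ 2, B → A with guard x ≤ 5 and B → B with guard x ≥ 3 (none with a reset), the traversal from `("A", 0)` enumerates four constraints and stops, because both successors of `("B", 2)` entail constraints enumerated earlier.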

A depth-first symbolic forward analysis depends on a chosen order of edges. 
Symbolic forward analysis terminates if and only if the depth-first symbolic 
forward analysis of U terminates for every order chosen. 

If the symbolic depth-first forward analysis of U terminates for at least one 
order of edges, then also the breadth-first version terminates. The converse need 
not be true, as the counterexample of Figure 1 shows. 



Fig. 1. Example of a timed automaton for which the breadth-first version of
symbolic forward analysis terminates but the depth-first version does not, if the
edge numbered 4 is followed before the edge numbered 7.



A path $p$ in a zone tree is an infinite string over $\mathcal{E}$, i.e., $p \in \mathcal{E}^\omega$; $p$ contains
a node $w$ if the string $w$ is a prefix of $p$, written $w < p$. A node $v$ precedes a
node $w$ if $v$ is a prefix of $w$, written $v < w$.






Definition 5 (Local finiteness). A path $p$ of a zone tree is locally finite if and
only if it contains a node $w$ labeled by a constraint that entails the constraint
labeling some node $v$ preceding $w$ (formally, there exist $v$ and $w$ such that $v <
w < p$ and $[\![w]\!](\varphi^0) \models [\![v]\!](\varphi^0)$). A zone tree is locally finite if every path is.



Proposition 3. Every symbolic forward analysis of a timed automaton $\mathcal{U}$
terminates with local subsumption if and only if the zone tree of $\mathcal{U}$ is locally finite.

We will next investigate the special class of strings (that we call cycles) that 
correspond to cycles in the control graph of the given timed automaton. Each 
cycle in the graph-theoretic sense corresponds to finitely many cycles in the sense 
defined here (as strings), depending on the entry location. 

We say that an edge $e$ of the form $L = \ell \ldots\ [\,]\ L' = \ell' \ldots$ leads from the
location $\ell$ to the location $\ell'$. This terminology reflects the fact that there exists
a directed edge from $\ell$ to $\ell'$ labeled by the corresponding guarded command in
the control graph of the given timed automaton (we will not formally introduce
the control graph). Semantically, all transitions using such an edge go from a
position with the location $\ell$ to a position with the location $\ell'$. We canonically
extend the terminology ‘leads to’ from edges $e$ to strings $w$ of edges.

Definition 6 (Cycle). The string $w = e_1 \ldots e_m$ of length $m \ge 1$ is a cycle if
the sequence of edges $e_1, \ldots, e_m$ leads from a location $\ell$ to the same location $\ell$
such that there exists a sequence of edges that leads from the initial location $\ell^0$
to $\ell$ whose last edge is different from $e_m$.

The last condition above expresses that $\ell$ is an entry point to the corresponding
cycle in the control graph of the given timed automaton $\mathcal{U}$. The next notion is
used in effective sufficient termination conditions.

Definition 7 (Simple Cycle). A cycle $w = e_1 \ldots e_m$ is called simple if it does
not contain a proper subcycle; formally, no string $e_i \ldots e_j$ where $1 < i < j < m$
is also a cycle.



Proposition 4. A locally infinite path p G in the zone tree of the timed au- 
tomaton lA contains infinitely many occurrences of a simple cycle w; formally, p 
is an element of the omega-language {E* .w)^ . 

Proof. Let p be a locally infinite path. Then there exists a location $\ell$ such that
infinitely many nodes on this path are labeled by $\ell$ (i.e. by a constraint of the
form $L = \ell \wedge \ldots$). The strings formed by the edges connecting two nodes labeled
by $\ell$ must all contain a simple cycle. Since the number of simple cycles is finite,
some simple cycle must be repeated infinitely often. □

A string is stratifiable if it contains a stratified substring (a substring of a string
$e_1 \ldots e_m$ is any string of the form $e_i \ldots e_j$ where $1 \leq i \leq j \leq m$).








Fig. 2. Example of a timed automaton showing that the property: "Every reachable
location is reachable through a simple path" does not entail termination of
depth-first symbolic forward analysis.



Proposition 5. If every simple cycle of the timed automaton U is either reset-free
or stratifiable, the zone tree of U is locally finite.

Proof. Follows from Propositions 1, 2 and 4. □ 

We apply the above results to obtain our first sufficient termination condition. 

Theorem 1. Symbolic depth-first forward analysis of a timed automaton U terminates
if all simple cycles of U are either reset-free or stratifiable.

Proof. Follows from Propositions 3 and 5. □ 



4 RQ Automata 

A timed automaton U is called RQ [10] if for each clock x, U contains exactly one
edge with a reset of x and exactly one edge with a query of x, and moreover, for
every transition sequence of U' starting from the initial position, the sequence of
resets and queries of x is alternating, with a reset before the first query; here, U'
refers to the timed automaton obtained from U by replacing all conjuncts $x \sim c$
in the guard formulas by the conjunct $x \geq 0$. We may require w.l.o.g. that no
edge e of a timed automaton U contains both a reset of a clock and a query of
a clock.
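The two parts of the RQ condition can be illustrated as follows. This is a rough sketch under stated assumptions: edges are reduced to the sets of clocks they reset and query, all names are hypothetical, and the alternation test inspects one given transition sequence only, whereas the definition quantifies over all of them.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple

class Edge(NamedTuple):
    """Hypothetical edge abstraction: only resets and queries are recorded."""
    resets: FrozenSet[str]
    queries: FrozenSet[str]

def unique_reset_and_query(edges: Dict[str, Edge], clocks: Iterable[str]) -> bool:
    """Static part: each clock is reset on exactly one edge and queried on exactly one."""
    return all(
        sum(x in e.resets for e in edges.values()) == 1
        and sum(x in e.queries for e in edges.values()) == 1
        for x in clocks
    )

def alternates(edges: Dict[str, Edge], run: List[str], clocks: Iterable[str]) -> bool:
    """Dynamic part on one run: resets and queries of each clock alternate,
    with a reset before the first query."""
    for x in clocks:
        expect_reset = True
        for name in run:
            edge = edges[name]
            if x in edge.resets:
                if not expect_reset:
                    return False
                expect_reset = False
            elif x in edge.queries:
                if expect_reset:
                    return False
                expect_reset = True
    return True

EDGES = {
    "Rx": Edge(frozenset({"x"}), frozenset()),
    "Qx": Edge(frozenset(), frozenset({"x"})),
    "e": Edge(frozenset(), frozenset()),
}
print(unique_reset_and_query(EDGES, ["x"]))
print(alternates(EDGES, ["Rx", "e", "Qx", "Rx"], ["x"]))
print(alternates(EDGES, ["Qx", "Rx"], ["x"]))
```

A full check would have to explore every transition sequence of the relaxed automaton; the sketch only shows what is tested along each of them.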

RQ automata have the following interesting property: if a location is reach- 
able then it is reachable through a simple path, i.e. a sequence of edges that form 
a string not containing a cycle [10]. So it is possible to derive specialized termi- 
nating graph algorithms for reachability for RQ automata. Moreover, a cycle is 
traversable infinitely often if it is traversable once [10]. We will now investigate 
how a generic model checker based on symbolic forward analysis behaves on RQ 
automata. We do not know whether we obtain termination for this special case. 
We know that the distinguished property of RQ automata (that reachability is 
equivalent to reachability through a simple path) by itself is not sufficient for 
termination; Figure 2 gives a counterexample. 






We will consider two special classes of RQ automata. The first one is char- 
acterized by the cut condition. 

A timed automaton U satisfies the cut condition if any two simple cycles w
and w' are either identical or their sets of edges are disjoint. Graph-theoretically,
every simple cycle in the control graph has exactly one entry point (which is then
called the 'cut vertex').
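As an illustration, the cut condition is a pairwise test on the simple cycles. The sketch below assumes the simple cycles have already been enumerated as tuples of (hypothetical) edge names; the enumeration itself is not shown.

```python
from itertools import combinations
from typing import Iterable, Tuple

def satisfies_cut_condition(simple_cycles: Iterable[Tuple[str, ...]]) -> bool:
    """Any two simple cycles are identical or have disjoint sets of edges."""
    for w1, w2 in combinations(list(simple_cycles), 2):
        if w1 != w2 and set(w1) & set(w2):
            return False
    return True

print(satisfies_cut_condition([("b", "c"), ("d",)]))       # edge-disjoint cycles
print(satisfies_cut_condition([("b", "c"), ("c", "e")]))   # both use edge c
```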

Theorem 2. Symbolic depth-first forward analysis of an RQ timed automaton U 
terminates if it satisfies the cut condition and in every simple cycle, either all 
or no clock is reset. 

Proof. A simple cycle containing a reset for each clock in an RQ automaton sat- 
isfying the cut condition is stratified. Hence, Theorem 1 yields the statement. □ 

The second class of RQ automata is obtained by restricting the number of clocks 
to two. 

Theorem 3. Symbolic depth-first forward analysis of an RQ timed automaton 
with two clocks terminates. 

Proof. We name the two clock variables of the automaton x and y. We write $R_x$ for
the unique edge of the timed automaton where x is reset, and $Q_x$ for the one where x is
queried; similarly we define $R_y$ and $Q_y$. By our non-proper restriction, $R_x \neq Q_x$
etc.

A segment S of a path p in a zone tree is a sequence of nodes $n_1, \ldots, n_m$ of
the zone tree. The string $w = e_1 \ldots e_{m-1}$ labels the segment S if $n_m$ is reached
from $n_1$ by following the edges $e_1, \ldots, e_{m-1}$ in the zone tree.

For a proof by contradiction, assume that p is an infinite branch of the
zone tree. By Proposition 4, there exists a simple cycle w (leading, say, from
the location $\ell$ to $\ell$) that repeats infinitely often on p. We write $S_1, S_2, \ldots$ for
the segments that are labeled by w (in consecutive order). We write $L_i$ for the
segment between $S_i$ and $S_{i+1}$. We write $v^i$ for the string labeling the segment $L_i$;
each string $v^i$ is a cycle (leading also from the location $\ell$ to $\ell$). Below we will use
the terminology 'w labels $S_i$' and '$v^i$ labels $L_i$'.

We first distinguish between the cases whether the edge $R_x$ is part of the
string w ("$R_x \in w$") or not.

Case 1 $R_x \in w$.

The edge $Q_x$ must then also be an element of w (if the cycle w can be executed
once then even infinitely often [10]; if it contained $R_x$ but not $Q_x$ then the RQ
condition would be violated).

Case 1.1 $R_y \in w$.

Again, we must have that $Q_y \in w$.

We distinguish between the cases that the edge $R_y$ appears strictly before the
edge $Q_y$ in the string w ("$R_y < Q_y$") or after ("$Q_y < R_y$").

Case 1.1.1 $R_y < Q_y$.

Repeating the above reasoning for x instead of y, we distinguish between the
cases "$R_x < Q_x$" and "$Q_x < R_x$".






Case 1.1.1.1 $R_x < Q_x$.

The two assumptions $R_x < Q_x$ and $R_y < Q_y$ mean that the string w is stratified.
Hence, by Proposition 1, the successor constraint function wrt. w is constant.
Hence, the constraint labeling the last node of $S_2$ entails the constraint labeling
the last node of $S_1$. Thus, the path p is locally finite, which yields the
contradiction.

Case 1.1.1.2 $Q_x < R_x$.

We distinguish the cases whether the edge $Q_x$ appears before the edge $R_y$ or
strictly after.

Case 1.1.1.2.1 $Q_x < R_y$.

Combining the assumptions leading to this case, namely $R_x \in w$ (and hence
also $Q_x \in w$) and $R_y \in w$ (and hence also $Q_y \in w$) and $R_y < Q_y$ and $Q_x < R_x$
and $Q_x < R_y$, we know that the string w is of the form $w = w_1.Q_x.w_2$ such
that $w_2$ contains $R_x$ and $R_y$. Hence, the substring $w_2$ of w is stratified. By
Proposition 1, the successor constraint function wrt. $w_2$ is constant, and hence also
the one wrt. w. As in the case above, we achieve a contradiction.

Case 1.1.1.2.2 $R_y < Q_x$.

Again we combine the assumptions leading to this case: namely
$R_x, Q_x, R_y, Q_y \in w$ and $R_y < Q_y$ and $Q_x < R_x$ and $R_y < Q_x$.

Only using that $R_y < R_x$, we know that the string w is of the form
$w = w_1.R_y.w_2.R_x.w_3$.

One of the two cases, namely $R_x \notin L_i$ or $R_x \in L_i$, will hold for infinitely many
segments $L_i$.

Case 1.1.1.2.2.1 $R_x \notin L_i$.

Then also $Q_x \notin L_i$ (because of the RQ-condition and since $L_i$ is a cycle).

We then distinguish between the analogous cases for y instead of x.

Case 1.1.1.2.2.1.1 $R_y \notin L_i$.

Again, then $Q_y \notin L_i$.

We are assuming that $R_x, Q_x, R_y, Q_y \notin L_i$ for infinitely many $L_i$. We take two
such segments, calling them L and L'. Let v and v' be the strings labeling (the
edges linking the nodes in) L and L'. Then, the successor constraint functions
wrt. v and v' are the identity.

We form the stratified strings $V = R_x.w_3.v.w_1.R_y$ and $V' = R_x.w_3.v'.w_1.R_y$.
Since the successor constraint functions wrt. v and v' are the identity, the
successor constraint functions wrt. V and V' are the same constant function. The
same reasoning as above leads to a contradiction.

Case 1.1.1.2.2.1.2 $R_y \in L_i$.

Then also $Q_y \in L_i$. Because of the RQ-condition and since the edge $R_y$
precedes $Q_y$ in $S_i$, the first occurrence of $R_y$ precedes the first occurrence of $Q_y$
in $L_i$. Hence, the strings v and v' (defined as above, labeling some of the $L_i$) are of
the form $v = v_1.R_y.v_2$ and $v' = v'_1.R_y.v'_2$ where $v_1$, $v_2$, $v'_1$ and $v'_2$ do not contain any
reset or any query of a clock variable (and hence, yield the identity as the
successor constraint function). We form the stratified substrings $V = R_x.w_3.v_1.R_y$ and
$V' = R_x.w_3.v'_1.R_y$, which yield the same constant successor constraint function
for the same reason as above. Again, this leads to a contradiction.






Case 1.1.1.2.2.2 $R_x \in L_i$.

Again, then $Q_x \in L_i$. Now we are assuming that $R_x, Q_x, R_y, Q_y \in L_i$ for
infinitely many $L_i$.

As in Case 1.1.1.2.2.1.2, the first occurrence of $R_y$ must precede the first
occurrence of $Q_y$ in $L_i$.

Assume that there is a reset of x in $L_i$ before the first reset of y. We form the
string $R_x.w_2.v_1.R_x$ where $w = w_1.R_x.w_2$ is such that $w_2$ does not contain any
reset (by the assumptions for the cases 1.1.1.2 and 1.1.1.2.2) and $v^i = v_1.R_x.v_2$
(the string labeling $L_i$) is such that $v_1$ does not contain any reset. Following
the lines of the proof for Proposition 2 one can show that for any constraint $\varphi$,
$[\![R_x.w_2.v_1.R_x]\!](\varphi)$ entails $[\![R_x]\!](\varphi)$. This is a contradiction (to the fact that the
path p is locally infinite).

Assume that there is no reset of x in $L_i$ before the first reset of y. Then the
string formed by the edges leading from the reset of x in $S_i$ to the first reset of y
in $L_i$ is stratified. We can then apply the same reasoning as in Case 1.1.1.2.1 to
derive a contradiction.

Case 1.1.2 $Q_y < R_y$.

Thus now $R_x \in w$ (and hence $Q_x \in w$), $R_y \in w$ (and hence $Q_y \in w$) and
$Q_y < R_y$. Now we consider the following subcases of this case.

Case 1.1.2.1 $R_x < Q_x$.

This case is symmetric to Case 1.1.1.2.1 where $R_x, R_y \in w$, $Q_x < R_y$ and
$R_y < Q_y$.

Case 1.1.2.2 $Q_x < R_x$.

The assumption of the case is that the reset occurs after the query for both
clocks. Due to the RQ condition, there cannot be any query between the two
resets. Therefore, $R_x.w_1.R_y$ (or, symmetrically, $R_y.w_1.R_x$) forms a stratified
substring of w. As before, we obtain a contradiction.

We refer to the full version of this paper [13] for the remaining cases. □ 

5 Future Work 

The presented work targets theoretical investigations of timed automata not at 
the verification problem itself but, instead, at the termination behavior of the 
procedure solving it in practice, namely symbolic forward analysis. This work is a 
potential starting point for deriving interesting sufficient termination conditions. 
There are, however, other open questions along these lines. 

Our setup may also be used to derive necessary termination conditions. These 
are useful obviously in the cases when their test is negative. Another question is 
whether there exist decidable necessary and sufficient conditions. 

We may also consider logical equivalence instead of local subsumption for a 
practically more efficient, but theoretically weaker fixpoint test (used in tools 
such as Uppaal [11]). We observe that Proposition 1 is still directly applicable in 
the new context, but Proposition 2 is not. The comparison of the different fix- 
point tests (equivalence, local and global subsumption) is an interesting subject 
of research. 






We may be able to derive natural and less restrictive sufficient termination
conditions when we consider the enhancement of symbolic forward analysis with
techniques from [3] to compute the effect of loops, i.e. essentially the constraint
transformer $[\![w^\omega]\!]$ for simple cycles w.

The constraint transformers $[\![w]\!]$ form a 'symbolic version' of the syntactic
monoid [6] for timed automata. This notion may be of intrinsic interest and
deserve further study.



Acknowledgement 

We thank Tom Henzinger and Jean-Francois Raskin for discussions. 



References 

1. R. Alur and D. Dill. A theory of timed automata. Theoretical Computer Science,
126(2):183-236, 1994.

2. F. Balarin. Approximate reachability analysis of timed automata. In Proceedings of
17th IEEE Real-Time Systems Symposium, pages 52-61. IEEE Computer Society
Press, 1996.

3. B. Boigelot. Symbolic Methods for Exploring Infinite State Spaces. PhD thesis,
Université de Liège, 1998.

4. C. Daws and S. Tripakis. Model checking of real-time reachability properties using
abstractions. In B. Steffen, editor, Proceedings of the 4th International Conference
on Tools and Algorithms for the Construction and Analysis of Systems, LNCS 1384,
pages 313-329. Springer-Verlag, 1998.

5. G. Delzanno and A. Podelski. Model checking in CLP. In Rance Cleaveland,
editor, Proceedings of TACAS'99, the Second International Conference on Tools
and Algorithms for the Construction and Analysis of Systems, volume 1579 of
Springer LNCS. Springer-Verlag, 1999.

6. S. Eilenberg. Automata, Languages and Machines, volume B. Academic Press,
1976.

7. T. A. Henzinger, P. W. Kopke, A. Puri, and P. Varaiya. What's decidable about
hybrid automata? In Proceedings of the 27th Annual Symposium on Theory of
Computing, pages 373-382. ACM Press, 1995.

8. T. A. Henzinger, O. Kupferman, and S. Qadeer. From pre-historic to post-modern
symbolic model checking. In Proceedings of the International Conference
on Computer-Aided Verification, pages 195-206. Springer, 1998.

9. T. A. Henzinger, X. Nicollin, J. Sifakis, and S. Yovine. Symbolic model checking for
real-time systems. Information and Computation, 111(2):193-244, 1994. Special
issue for LICS 92.

10. W. K. C. Lam and R. K. Brayton. Alternating RQ timed automata. In C. Courcoubetis,
editor, Proceedings of the 5th International Conference on Computer-Aided
Verification, LNCS 697, pages 236-252. Springer-Verlag, 1993.

11. K. G. Larsen, P. Pettersson, and W. Yi. Compositional and symbolic model checking
of real-time systems. In Proceedings of the 16th Annual Real-time Systems
Symposium, pages 76-87. IEEE Computer Society Press, 1995.

12. K. Marriott and P. J. Stuckey. Programming with Constraints: An Introduction.
MIT Press, 1998.






13. S. Mukhopadhyay and A. Podelski. Beyond region graphs: Symbolic forward
analysis of timed automata, 1999. Full Version. Available at
http://www.mpi-sb.mpg.de/~podelski.



Implicit Temporal Query Languages: Towards 

Completeness 



Nicole Bidoit¹ and Sandra de Amo²*



¹ LaBRI-UMR 5800 du CNRS, Université Bordeaux 1, France
Nicole.Bidoit@labri.u-bordeaux.fr

² Department of Computer Science, Federal University of Uberlandia, Brazil

deamo@ufu.br



Abstract. In the propositional case, it is known that temporal logic
(TL) and first order logic with timestamp (TS-FO) have the same expressive
power. Recent work has proved that, in contrast, there are first
order logic queries on timestamp databases that are not expressible in
first order temporal logic: TL is not complete. Specifying a complete implicit
temporal query language remains an open problem. We investigate
two extensions of TL, namely NTL and RNTL. A strict hierarchy in
expressive power among fragments RNTL$^i$ of RNTL is established. On
the one hand, it leads to the conjecture that RNTL is complete. On the
other hand, it provides a new promising perspective towards proving the
well-known conjecture that there is a strict hierarchy among the i
time-variable fragments TS-FO$^i$ of TS-FO.

Keywords: Temporal database, Query languages, Temporal logic, Expressive
power, Communication protocol.



1 Introduction 

There are two alternative ways [5,6] of extending the relational model in order 
to represent temporal data. The first approach captures time in an implicit 
manner: a relational temporal database instance is then a finite sequence of
relational instances. The second approach relies on augmenting each relation 
with a “timestamp” column storing the time instants of validity of each tuple. 
Figure 1 illustrates these two equivalent representations. 

In the context of an implicit representation of time, query languages, called 
implicit temporal query languages, are usually based on first order temporal 
logic [8]. When time is explicitly represented, queries are specified using the 
standard relational query languages [3] with built-in linear order on the times- 
tamps. One of these languages, called TS-FO, is the relational calculus (i.e. first 
order logic) with timestamps. A non trivial question arises then: how implicit 

* Work by this author was funded by the University Bordeaux 1 and done during her
visit at LaBRI in 1998-1999.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 245-257, 1999.
© Springer-Verlag Berlin Heidelberg 1999






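Implicit representation (one state of R per time instant):

    T = 1    T = 2    T = 3
    R        R        R
    A        A        A
    a        b        a
    b        d        c
    c

Explicit (timestamp) representation:

    A   T
    a   1
    b   1
    c   1
    b   2
    d   2
    a   3
    c   3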



Fig. 1. The implicit and explicit representations of time 



temporal languages and explicit temporal languages relate to each other with
respect to expressive power. This question has been studied from various angles
in [2], [5], [7], [9], [13] and [10].

[2] and [1] provide a hierarchy of temporal languages¹ with respect to
expressivity (see also [13]): (1) it is shown that future first order temporal logic
(FTL) is strictly weaker than first order temporal logic (TL) and (2) that TL
is strictly weaker than TS-FO. The first result (FTL $\subset$ TL) follows from the
fact that the query "Does there exist an instant (in the future) whose
state equals the initial state?" cannot be expressed in FTL but is expressible
in TL. The second result (TL $\subset$ TS-FO) is derived by showing that the query
"Does there exist two distinct instants (in the future) whose respective
states are equal?" cannot be expressed in TL but is expressible in
TS-FO. These two results are of major interest and stand in contrast with the
propositional case. In [9], the notion of complete temporal language is introduced
via equivalence with TS-FO and the authors show that propositional TL is
complete (see also [11]). Moreover, it is shown that propositional FTL is equivalent
to propositional TL.

In the present paper, we study the following open problem : find an implicit 
first order temporal language which is complete i.e. equivalent to TS-FO. We 
enrich the hierarchy described above by investigating two languages NTL and 
RNTL. Both languages are shown to be more powerful than TL. The language 
NTL is not complete. The language RNTL is more expressive than NTL. Com- 
pleteness of RNTL remains a conjecture. 

The language NTL is the first order linear version of NCTL*[12]. It extends 
TL by a temporal modality H (“From Now On”). Intuitively, the modality H 
allows one to choose a new initial time instant (called relative origin) and for- 
get about all previous instants. In this sense, it can be said that NTL intro- 
duces a notion of relative past: the past modalities (Previous and Since) are 
evaluated with respect to the relative origin. This stands in contrast with TL 
whose modalities are of course always evaluated with respect to the absolute 
origin. Relative past increases the expressive power of TL in the first order 
case even if in the propositional case relative past is redundant with the other 
temporal operators [12]. However, the language NTL is not complete: we show 
that the query Does there exist 3 distinct instants whose respective 

¹ Other languages investigated by these authors are stronger with respect to
expressivity than TS-FO and TL.




states (let say $S_1$, $S_2$ and $S_3$) satisfy $S_1 \cap S_2 = \emptyset$ and $S_1 \cup S_2 = S_3$?
is expressible in TS-FO but not in NTL². This result is proved by extending the
proof technique based on communication protocols developed in [2].

Because NTL fails to be complete, we extend it by investigating a rather
simple idea: allowing one to forget the past is coupled together with allowing
one to restore the past. The implicit language RNTL is defined by introducing a
temporal modality $\Re$ whose task is to restore the segment of the past which has
been "removed" by the last "application" of H. Once again, the ability to restore
the past does not add any expressive power in the propositional case [4]. However,
in the first order case, RNTL is strictly more expressive than NTL. A strict
hierarchy in expressive power among fragments RNTL$^i$ of RNTL is established.
The fragment RNTL$^i$ is defined by restriction on the maximal number of $\Re$
operators in formulas. On the one hand, this leads to the conjecture that RNTL is
complete. On the other hand, this provides a new promising perspective towards
proving the well-known conjecture that there is a strict hierarchy in expressive
power among the i time-variable fragments TS-FO$^i$ of TS-FO. The fragment
TS-FO$^i$ is the subclass of TS-FO formulas built by restricting the number of
distinct time-variables to be at most i.

The paper is organized as follows. The next section is devoted to introductory
material. Section 3 discusses the expressive power of NTL and introduces the
communication protocols which are the key tools for the main results of the
paper. The last section introduces the language RNTL and investigates the RNTL$^i$
hierarchy as well as its relationship with the conjectured TS-FO$^i$ hierarchy. We
also discuss the results of [13].

Because of space limitation, some technical proofs are omitted and other 
proofs are simply sketched. 



2 Preliminaries 

We assume the reader is familiar with relational database concepts [3]. A
database schema is a finite set of relation names with associated arity. An
instance of a schema assigns to each relation name a finite relation of appropriate
arity over a fixed countably infinite domain of data elements. The active domain
of an instance is the set of all data elements appearing in some of its relations.
Valuations of variables are always assumed in the active domain. A temporal
instance over a database schema $\mathcal{R}$ is a non-empty finite sequence
$\mathcal{I} = I_1, \ldots, I_l$ ($l \geq 1$) of instances of $\mathcal{R}$.

The query language TL. Temporal logic [8] is an obvious candidate language for
querying temporal databases. The syntax of TL over some database schema $\mathcal{R}$ is
obtained using the formation rules for standard first order logic over $\mathcal{R}$ together
with the additional formation rule: if $\varphi_1$ and $\varphi_2$ are formulas then $\varphi_1$ Until $\varphi_2$,
$\varphi_1$ Since $\varphi_2$, Next $\varphi_1$ and Prev $\varphi_1$ are formulas.

² One says that the query separates NTL and TS-FO.






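The semantics of TL is briefly recalled. Given a temporal instance $\mathcal{I} = I_1, \ldots, I_l$
over $\mathcal{R}$ and a TL formula $\varphi$, the truth of $\varphi$ at time $i \in [1, l]$ given the
valuation $\nu$ of the free variables of $\varphi$, denoted $[\mathcal{I}, i, \nu] \models_{tl} \varphi$, is defined as follows:

$[\mathcal{I}, i, \nu] \models_{tl} R(t_1, \ldots, t_k)$ iff $(\nu(t_1), \nu(t_2), \ldots, \nu(t_k)) \in I_i(R)$.

If $\varphi$ is a boolean combination of formulas or a quantification ($\exists$, $\forall$) of a
formula then the definition is as usual.

$[\mathcal{I}, i, \nu] \models_{tl} \varphi_1$ Until $\varphi_2$ iff there exists $j > i$ such that $[\mathcal{I}, j, \nu] \models_{tl} \varphi_2$ and for
each k such that $i < k < j$, $[\mathcal{I}, k, \nu] \models_{tl} \varphi_1$.

$[\mathcal{I}, i, \nu] \models_{tl} \varphi_1$ Since $\varphi_2$ iff there exists $j < i$ such that $[\mathcal{I}, j, \nu] \models_{tl} \varphi_2$ and for
each k such that $j < k < i$, $[\mathcal{I}, k, \nu] \models_{tl} \varphi_1$.

$[\mathcal{I}, i, \nu] \models_{tl}$ Next $\varphi_1$ iff $i < l$ and $[\mathcal{I}, i + 1, \nu] \models_{tl} \varphi_1$.

$[\mathcal{I}, i, \nu] \models_{tl}$ Prev $\varphi_1$ iff $i > 1$ and $[\mathcal{I}, i - 1, \nu] \models_{tl} \varphi_1$.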

It is sometimes convenient to use the following derived temporal modalities:

F $\varphi_1$ = true Until $\varphi_1$ ("sometimes in the future $\varphi_1$"), P $\varphi_1$ = true Since $\varphi_1$
("sometimes in the past $\varphi_1$"), first = $\neg$ Prev true ("initial state"), last =
$\neg$ Next true ("final state").

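A query Q in TL is specified by $\{(\vec{x}) \mid \varphi(\vec{x})\}$ where $\varphi(\vec{x})$ is a formula in
TL and $\vec{x}$ its free variables. The answer of Q over a temporal instance $\mathcal{I}$ is the
relation $Q(\mathcal{I}) = \{\nu(\vec{x}) \mid [\mathcal{I}, 1, \nu] \models_{tl} \varphi(\vec{x}),\ \nu \text{ a valuation of } \vec{x}\}$.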
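The semantics recalled above translates almost clause by clause into a small evaluator. The following is a minimal sketch under stated assumptions: formulas are encoded as nested tuples (an ad hoc encoding, not taken from the paper), Until and Since use the strict inequalities as they appear in the text, quantifiers range over the active domain of the whole temporal instance, and queries are evaluated at the first instant. The instance FIG1 mirrors Figure 1 and the query is the TL query of Example 1.

```python
from itertools import product

def active_domain(inst):
    """All data elements occurring in some relation of some state."""
    return {d for state in inst for tuples in state.values() for tup in tuples for d in tup}

def holds(inst, i, nu, f):
    """Truth of formula f at time i (1-based) in inst under the valuation nu."""
    op, l = f[0], len(inst)
    if op == "true":
        return True
    if op == "atom":      # ("atom", "R", ("x",)); names outside nu act as constants
        return tuple(nu.get(a, a) for a in f[2]) in inst[i - 1][f[1]]
    if op == "not":
        return not holds(inst, i, nu, f[1])
    if op == "and":
        return holds(inst, i, nu, f[1]) and holds(inst, i, nu, f[2])
    if op == "exists":    # ("exists", "y", body)
        return any(holds(inst, i, {**nu, f[1]: d}, f[2]) for d in active_domain(inst))
    if op == "until":     # some j > i satisfies f[2], and f[1] holds strictly in between
        return any(holds(inst, j, nu, f[2])
                   and all(holds(inst, k, nu, f[1]) for k in range(i + 1, j))
                   for j in range(i + 1, l + 1))
    if op == "since":     # some j < i satisfies f[2], and f[1] holds strictly in between
        return any(holds(inst, j, nu, f[2])
                   and all(holds(inst, k, nu, f[1]) for k in range(j + 1, i))
                   for j in range(1, i))
    if op == "next":
        return i < l and holds(inst, i + 1, nu, f[1])
    if op == "prev":
        return i > 1 and holds(inst, i - 1, nu, f[1])
    raise ValueError(f"unknown operator {op!r}")

def answer(inst, free_vars, f):
    """Answer of the query {(free_vars) | f}, evaluated at the first instant."""
    dom = sorted(active_domain(inst))
    return {vals for vals in product(dom, repeat=len(free_vars))
            if holds(inst, 1, dict(zip(free_vars, vals)), f)}

TRUE = ("true",)
LAST = ("not", ("next", TRUE))                 # last = not Next true
def R(a):
    return ("atom", "R", (a,))

# The temporal instance of Figure 1: three states of the unary relation R.
FIG1 = [{"R": {("a",), ("b",), ("c",)}},
        {"R": {("b",), ("d",)}},
        {"R": {("a",), ("c",)}}]

# {(x) | R(x) and exists y (R(y) Until (R(x) and last))}
QUERY = ("and", R("x"), ("exists", "y", ("until", R("y"), ("and", R("x"), LAST))))
print(sorted(answer(FIG1, ["x"], QUERY)))
```

On the instance of Figure 1 the sketch returns the tuples for a and c, which agrees with the answer {a, c} stated in Example 1.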

The timestamp representation and the language TS-FO. A timestamp temporal
instance over a schema $\mathcal{R}$ is a two-sorted relational structure over the
timestamp schema $\mathcal{R}^{est}$. The schema $\mathcal{R}^{est}$ contains for each k-ary relation R of
$\mathcal{R}$ an extended relation $R^{est}$ of arity k + 1. The extra column of this relation
holds timestamps, the other ones hold data elements. Given an implicit temporal
instance $\mathcal{I} = I_1, \ldots, I_l$, its timestamp representation is denoted $\mathcal{I}^{est}$.
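The passage between the two representations is mechanical, as the following sketch suggests. The dictionary-based encoding and the function names are assumptions made for illustration; the data is that of Figure 1.

```python
def to_timestamp(inst):
    """Implicit instance (list of states) -> timestamp relations with an extra time column."""
    est = {}
    for t, state in enumerate(inst, start=1):
        for rel, tuples in state.items():
            est.setdefault(rel, set()).update(tup + (t,) for tup in tuples)
    return est

def to_implicit(est, length):
    """Timestamp relations -> implicit instance with the given number of states."""
    inst = [{rel: set() for rel in est} for _ in range(length)]
    for rel, tuples in est.items():
        for *data, t in tuples:
            inst[t - 1][rel].add(tuple(data))
    return inst

# The temporal instance of Figure 1.
FIG1 = [{"R": {("a",), ("b",), ("c",)}},
        {"R": {("b",), ("d",)}},
        {"R": {("a",), ("c",)}}]

FIG1_EST = to_timestamp(FIG1)
print(sorted(FIG1_EST["R"], key=lambda row: (row[1], row[0])))
print(to_implicit(FIG1_EST, len(FIG1)) == FIG1)
```

The length of the sequence is passed explicitly to `to_implicit` because trailing empty states leave no trace in the timestamp relations.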

TS-FO is a query language over timestamp temporal databases defined in
a straightforward way by two-sorted first order formulas. The data variables
range over data elements and time variables over integers.

A query Q in TS-FO is specified by $\{\vec{x} \mid \varphi(\vec{x})\}$ where $\varphi(\vec{x})$ is a formula in
TS-FO and $\vec{x}$ its free variables; each free variable is of data sort. The answer
of Q evaluated on a timestamp temporal instance $\mathcal{I}^{est}$ is the relation
$Q(\mathcal{I}^{est}) = \{\nu(\vec{x}) \mid \mathcal{I}^{est} \models \varphi(\vec{x}),\ \nu \text{ a valuation of } \vec{x}\}$.

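Example 1. Let $\mathcal{R}$ and $\mathcal{I}$ be the schema and temporal instance of Figure 1. The
TL query $\{(x) \mid R(x) \wedge \exists y\, (R(y) \text{ Until } (R(x) \wedge \text{last}))\}$ evaluated on $\mathcal{I}$ returns
{a, c}. This query can be equivalently expressed by the TS-FO expression
$\{(x) \mid R^{est}(x, 1) \wedge \exists y\, \exists t\, (t > 1 \wedge R^{est}(x, t) \wedge \neg \text{true}(t + 1) \wedge \forall s\, ((1 < s \wedge s < t) \rightarrow R^{est}(y, s)))\}$.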
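For comparison, the TS-FO expression of Example 1 can be evaluated directly on the timestamp relation. In this sketch the conjunct on t + 1 is read as "t is the final instant" (the TS-FO counterpart of last), which is an interpretation; the function name and the encoding of the relation as a set of (data, time) pairs are likewise assumptions.

```python
def example1_ts_fo(r_est, last):
    """Data elements x with R_est(x, 1) for which some y and t exist such that
    t > 1, R_est(x, t), t is the final instant, and R_est(y, s) for all 1 < s < t."""
    data = {d for d, _ in r_est}
    return {
        x for x in data
        if (x, 1) in r_est
        and any(
            t > 1 and (x, t) in r_est and t == last
            and all((y, s) in r_est for s in range(2, t))
            for y in data for t in range(1, last + 1)
        )
    }

# Timestamp representation of the instance of Figure 1.
R_EST = {("a", 1), ("b", 1), ("c", 1), ("b", 2), ("d", 2), ("a", 3), ("c", 3)}
print(sorted(example1_ts_fo(R_EST, last=3)))
```

The explicit time variables t and s take over the role played by Until and last in the TL formulation.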

3 The Expressive Power of NTL 

In this section, we investigate the implicit temporal query language NTL. This 
language has been initially introduced in [12] as a propositional branched tem- 
poral language. Here we consider the linear first order version of this language. 






NTL extends TL by a modality H. Intuitively, this modality stands for from now 
on or henceforth. It is meant to restrict the scope of past-time operators by for- 
getting the past instants with respect to an instant which becomes the (relative) 
origin of time. In the propositional case, [12] shows that this new modality is 
redundant with pure-future modalities. Here we will show that, in the first order
case, NTL is strictly more expressive than TL. However, NTL is not complete 
since it is not equivalent to TS-FO. 

The syntax of NTL is defined by adding the following formation rule to the
ones defining TL: if $\varphi$ is a formula, so is H $\varphi$.

Notation: Let $\mathcal{I} = (I_1, \ldots, I_l)$ be a temporal instance. The temporal instance
$\mathcal{J} = J_1, \ldots, J_{l-i+1}$ such that for $j = 1, \ldots, l - i + 1$, $J_j = I_{j+i-1}$ is denoted by $\mathcal{I}|_i$.
Intuitively, $\mathcal{I}|_i$ is the instance $\mathcal{I}$ where the first $i - 1$ states have been removed.

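Definition 1. Given a temporal instance $\mathcal{I} = (I_1, \ldots, I_l)$ and an NTL formula $\varphi$,
the truth of $\varphi$ at time p, given the relative origin t and the valuation $\nu$ of the
free variables, denoted $[\mathcal{I}, t, p, \nu] \models_{ntl} \varphi$, is defined as follows:

- If $\varphi$ is obtained by TL formation rules then $[\mathcal{I}, t, p, \nu] \models_{ntl} \varphi$ is defined in a
way similar to $[\mathcal{I}|_t, p - t + 1, \nu] \models_{tl} \varphi$.

- If $\varphi$ is of the form H $\varphi_1$ then $[\mathcal{I}, t, p, \nu] \models_{ntl} \varphi$ iff $[\mathcal{I}, p, p, \nu] \models_{ntl} \varphi_1$. □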

The effect of the temporal operator H is to position the relative origin at p, 
leaving the value p of the “at” time unchanged. 

Example 2. The formula F H(R(d) $\wedge$ first) is satisfied by the temporal instance $\mathcal{I}$
of Figure 1. Intuitively, in $\mathcal{I}$, there exists an instant (here 2) such that when
the previous states are removed (here state 1 only), it is the first instant and its
associated state contains R(d).

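The answer of the NTL query $\{(\vec{x}) \mid \varphi(\vec{x})\}$ evaluated on the temporal
instance $\mathcal{I}$ is the relation $\{\nu(\vec{x}) \mid [\mathcal{I}, 1, 1, \nu] \models_{ntl} \varphi(\vec{x}),\ \nu \text{ a valuation of } \vec{x}\}$.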
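The role of the relative origin can be made concrete with an evaluator that carries the pair (t, p). This is a sketch only: the tuple encoding of formulas is ad hoc, Until and Since use the strict readings, quantifiers range over the active domain of the whole instance, and the past operators stop at the relative origin t, which is how the sketch interprets evaluation on the truncated instance. The instance FIG1 mirrors Figure 1 and EX2 is the formula of Example 2.

```python
def active_domain(inst):
    return {d for state in inst for tuples in state.values() for tup in tuples for d in tup}

def holds(inst, t, p, nu, f):
    """Truth of f at time p with relative origin t (both 1-based) under valuation nu."""
    op, l = f[0], len(inst)

    def at(q, g):
        return holds(inst, t, q, nu, g)

    if op == "true":
        return True
    if op == "atom":       # ("atom", "R", ("x",)); names outside nu act as constants
        return tuple(nu.get(a, a) for a in f[2]) in inst[p - 1][f[1]]
    if op == "not":
        return not at(p, f[1])
    if op == "and":
        return at(p, f[1]) and at(p, f[2])
    if op == "iff":
        return at(p, f[1]) == at(p, f[2])
    if op == "forall":     # ("forall", "x", body)
        return all(holds(inst, t, p, {**nu, f[1]: d}, f[2]) for d in active_domain(inst))
    if op == "until":
        return any(at(j, f[2]) and all(at(k, f[1]) for k in range(p + 1, j))
                   for j in range(p + 1, l + 1))
    if op == "since":      # the visible past starts at the relative origin t
        return any(at(j, f[2]) and all(at(k, f[1]) for k in range(j + 1, p))
                   for j in range(t, p))
    if op == "next":
        return p < l and at(p + 1, f[1])
    if op == "prev":
        return p > t and at(p - 1, f[1])
    if op == "H":          # the current instant becomes the relative origin
        return holds(inst, p, p, nu, f[1])
    raise ValueError(f"unknown operator {op!r}")

TRUE = ("true",)
FIRST = ("not", ("prev", TRUE))
def F(g): return ("until", TRUE, g)
def P(g): return ("since", TRUE, g)
def H(g): return ("H", g)
def R(a): return ("atom", "R", (a,))

FIG1 = [{"R": {("a",), ("b",), ("c",)}},
        {"R": {("b",), ("d",)}},
        {"R": {("a",), ("c",)}}]

EX2 = F(H(("and", R("d"), FIRST)))                                   # Example 2
EQUAL = F(H(F(("forall", "x", ("iff", R("x"), P(("and", FIRST, R("x"))))))))
REPEAT = [{"R": {("a",)}}, {"R": {("b",)}}, {"R": {("b",)}}]

print(holds(FIG1, 1, 1, {}, EX2))
print(holds(FIG1, 1, 1, {}, EQUAL))
print(holds(REPEAT, 1, 1, {}, EQUAL))
```

EQUAL encodes the formula F H F(forall x (R(x) iff P(first and R(x)))) discussed in the text. Under the strict reading of F used in this sketch, both instants it compares lie strictly after the evaluation point.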

NTL versus TL The first result establishes that the modality H increases the
expressive power of TL. This is entailed by the query "Does there exist two
distinct instants whose respective states are equal?" which is not expressible
in TL (see the introduction) but can be expressed in NTL by the
boolean formula $\{() \mid \text{F H F}(\forall x\, (R(x) \leftrightarrow \text{P}(\text{first} \wedge R(x))))\}$. So we have:

Theorem 1. TL $\subset$ NTL

NTL versus TS-FO The increase in expressive power provided by the temporal 
operator H is not sufficient to express all TS-FO queries. Of course, each NTL 
query can be expressed as a TS-FO query. This is implied by the following first 
result: 

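Lemma 1. Let $\varphi(\vec{x})$ be an NTL formula. There exists a TS-FO formula
$\psi(i, j, \vec{x})$ such that for each implicit temporal instance $\mathcal{I}$, $[\mathcal{I}, t, p, \nu] \models_{ntl} \varphi(\vec{x})$
iff $[\mathcal{I}^{est}, \mu] \models \psi(i, j, \vec{x})$ where $\mu$ is the valuation $\nu$ extended by $[i/t, j/p]$.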






The proof does not present any difficulty and is not presented here. We now
show that the inclusion NTL $\subseteq$ TS-FO is in fact a strict inclusion:

Theorem 2. NTL $\subset$ TS-FO

The proof technique is a generalization of that introduced in [2] and is based 
on the following three steps: 

1. We extend the communication protocols proposed in [2]. These protocols allow
us to introduce the notion of polynomial communication complexity of k-ary
predicates on sets of sets of data elements.

2. We exhibit a ternary predicate whose communication complexity is not poly- 
nomial. 

3. Finally, it is proved that if a ternary predicate can be expressed by a NTL 
query, then it has a polynomial communication complexity. 

Communication protocols revisited Let $\mathcal{D}$ be a finite set of elements. Sets of
non-empty subsets of $\mathcal{D}$, called super-sets, are denoted by $\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \ldots$ The
communication protocols involve two partners I and II who exchange messages, i.e.
finite relations of fixed arity over $\mathcal{D}$. The two partners are supposed to collaborate
in the process of computing some predicate on super-sets. Given n super-sets,
they are distributed over the two partners in such a way that they share n - 2
super-sets. This implies that each partner hides one super-set from the other. The
super-sets shared are called the pivot and the number of super-sets shared the
order of the pivot.

Definition 2. A communication protocol Pr of arity k with pivot of order v is
specified by two abstract functions f, g. A message of arity k is a finite subset
of $\mathcal{D}^k$ or a boolean (true or false).

Given v + 2 super-sets enumerated by $\mathcal{Y}, \mathcal{Z}, \mathcal{X}_1, \ldots, \mathcal{X}_v$, the execution of the
protocol Pr over $\mathcal{Y}, \mathcal{Z}$ with pivot $\mathcal{X}_1, \ldots, \mathcal{X}_v$ is a finite sequence $((a_1, b_1) \ldots (a_r, b_r))$
of pairs of messages of arity k exchanged by the partners I and II as follows:

$a_i = f(\mathcal{D}, \mathcal{X}_1, \ldots, \mathcal{X}_v, \mathcal{Y}, [b_1, b_2, \ldots, b_{i-1}])$ is the ith message of I to II.
$b_i = g(\mathcal{D}, \mathcal{X}_1, \ldots, \mathcal{X}_v, \mathcal{Z}, [a_1, a_2, \ldots, a_i])$ is the ith answer of II to I. □

Next, we omit to specify the arity of a communication protocol and denote
by $Pr_v$ a communication protocol whose pivot is of order v. For each (v + 2)-tuple
of super-sets $\mathcal{Y}, \mathcal{Z}, \mathcal{X}_1, \ldots, \mathcal{X}_v$, $Pr_v(\mathcal{X}_1, \ldots, \mathcal{X}_v \| \mathcal{Y}, \mathcal{Z})$ denotes the last message $a_r$
of partner I. We say that the protocol is boolean when this last message is a
boolean. The notion of polynomial communication complexity is introduced as
follows:

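Definition 3. The communication protocol $Pr_v$ is polynomial if there exists a
function k of polynomial order such that for each (v + 2)-tuple of super-sets $\mathcal{Y}, \mathcal{Z}$,
$\mathcal{X}_1, \ldots, \mathcal{X}_v$ we have: $r = k(\sum_{i=1}^{v} |\mathcal{X}_i|)$ where $a_r = Pr_v(\mathcal{X}_1, \ldots, \mathcal{X}_v \| \mathcal{Y}, \mathcal{Z})$. □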

The messages exchanged by the two partners during the execution of a
communication protocol aim at computing a predicate of arity v + 2 when the order
of the pivot is v.






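Definition 4. A boolean communication protocol $Pr_v$ computes the predicate $\mathcal{P}$
of arity v + 2 if there exists a permutation $\pi$ on $\{1, \ldots, v + 2\}$ such that for each
tuple of super-sets $\mathcal{X}_1, \ldots, \mathcal{X}_{v+2}$ we have:

$\mathcal{P}(\mathcal{X}_1, \ldots, \mathcal{X}_{v+2}) = \text{true}$ iff $Pr_v(\mathcal{X}_{\pi(1)}, \ldots, \mathcal{X}_{\pi(v)} \| \mathcal{X}_{\pi(v+1)}, \mathcal{X}_{\pi(v+2)}) = \text{true}$ □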

Next, we say that the predicate $\mathcal{P}$ of arity v + 2 has a polynomial
communication complexity (for short, we say that $\mathcal{P}$ is v-polynomial) if there exists a
polynomial communication protocol $Pr_v$ computing $\mathcal{P}$. Note that the
communication protocols defined in [2] correspond exactly to our protocols with pivot of
order 0 and that the predicates considered there are binary.

For the purpose of proving that NTL $\subset$ TS-FO, we will focus on communication
protocols with pivot of order 1 and on ternary predicates. Communication
protocols with pivot of higher order will be brought into play in the last section
when investigating a hierarchy of subclasses of RNTL.

Example 3. The predicate specified by there exists $X \in \mathcal{X}$, $Y \in \mathcal{Y}$ and $Z \in \mathcal{Z}$
such that $X = Y = Z$ is 1-polynomial. Assuming that the partner I (resp. II)
knows the super-sets $\mathcal{X}$ and $\mathcal{Y}$ (resp. $\mathcal{Y}$ and $\mathcal{Z}$), the protocol computing this
predicate proceeds as follows: partner I identifies the sets in $\mathcal{X} \cap \mathcal{Y}$ and sends
them one by one to its partner who checks if one of these sets is also in $\mathcal{Y} \cap \mathcal{Z}$.
Thus the number of exchanges is bounded by $|\mathcal{Y}|$.
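
The protocol of Example 3 can be simulated with a few lines of Python. This is only an illustrative sketch: the function name `example3_protocol`, the representation of super-sets as Python sets of frozensets, and the convention of counting one exchange per set sent by partner I are assumptions made here, not part of the paper.

```python
from typing import FrozenSet, Set, Tuple

SuperSet = Set[FrozenSet[str]]  # a super-set is a set of sets (assumed encoding)


def example3_protocol(xs: SuperSet, ys: SuperSet, zs: SuperSet) -> Tuple[bool, int]:
    """Partner I knows (xs, ys), partner II knows (ys, zs); ys is the pivot.

    Returns the boolean answer and the number of sets sent by partner I.
    """
    exchanges = 0
    # Partner I identifies the sets in X ∩ Y and sends them one by one.
    for candidate in sorted(xs & ys, key=sorted):
        exchanges += 1
        # Partner II checks whether the received set is also in Y ∩ Z.
        if candidate in (ys & zs):
            return True, exchanges
    return False, exchanges
```

Since every set sent belongs to the pivot, the counter can never exceed the size of `ys`, which matches the bound stated in the example.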

Lemma 2. The predicate N-SEP($\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$) specified by there exists $X \in \mathcal{X}$,
$Y \in \mathcal{Y}$ and $Z \in \mathcal{Z}$ such that $X \cap Y = \emptyset$ and $X \cup Y = Z$ is not 1-polynomial.

Proof (Sketch): Let us assume that there exists a polynomial communication
protocol computing N-SEP with $\mathcal{X}$ as the pivot. Let us fix $\mathcal{X}$ as $\{\{p_0\}\}$. Then
the number of exchanges needed to derive N-SEP($\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$) is constant for any
choice of $\mathcal{Y}$ and $\mathcal{Z}$. On the other hand, for any choice of $\mathcal{Y}$ and $\mathcal{Z}$, checking
N-SEP($\mathcal{X}$, $\mathcal{Y}$, $\mathcal{Z}$) amounts to checking TL-SEP($\mathcal{U}$, $\mathcal{V}$)^3 where $\mathcal{U} = \{Y \mid Y \in \mathcal{Y} \text{ and } p_0 \notin Y\}$
and $\mathcal{V} = \{Z - \{p_0\} \mid Z \in \mathcal{Z} \text{ and } p_0 \in Z\}$. This implies that the property TL-SEP
is 0-polynomial, a contradiction with [2]. The proof is similar when the pivot of
the protocol assumed to compute N-SEP is $\mathcal{Y}$ (resp. $\mathcal{Z}$). Due to space limitation,
these parts of the proof are not developed here. □

Complexity of NTL queries It remains to show that the language NTL is only 
able to express predicates (of arity 3) which are 1-polynomial. In order to link 
predicates and NTL queries, we restrict our attention to particular temporal 
instances meant to encode triples of super-sets. 

From now on the database schema is reduced to contain a unique unary 
relation. 

Definition 5. A temporal instance $\mathcal{I} = (I_1, \ldots, I_l)$ is 3-splitable (splitable, for
short) if there exist exactly two distinct instants $n$ and $m$ such that (a) $2 < n + 1 < m < l - 1$,
(b) $I_n = I_m = \emptyset$, and (c) $\forall i \neq j \in [1, n - 1]$, $I_i \neq I_j$;
$\forall i \neq j \in [n + 1, m - 1]$, $I_i \neq I_j$; $\forall i \neq j \in [m + 1, l]$, $I_i \neq I_j$. □



^3 TL-SEP($\mathcal{U}$, $\mathcal{V}$) is specified by $\mathcal{U} \cap \mathcal{V} \neq \emptyset$.






In the sequel, we say that a splitable temporal instance $\mathcal{I}$ encodes a tuple
$\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3$ of super-sets when $\mathcal{X}_1 = \{I_i \mid i = 1..n-1\}$, $\mathcal{X}_2 = \{I_i \mid i = n+1..m-1\}$,
and $\mathcal{X}_3 = \{I_i \mid i = m+1..l\}$. Thus when $\mathcal{I}$ encodes $\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3$, its left part
defined by $(I_1, \ldots, I_m)$ and denoted by $\mathcal{I}^{left}$ can be viewed as encoding the two
super-sets known by the partner I of a communication protocol. Symmetrically,
the right part of $\mathcal{I}$ defined by $\mathcal{I}|_n$ and denoted $\mathcal{I}^{right}$ can be viewed as encoding
the two super-sets known by the partner II of a communication protocol.
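
The encoding of a triple of super-sets by a splitable instance can be made concrete as follows. This is a sketch under stated assumptions: instants are 1-based as in the text, each state is a Python `frozenset`, the inequalities of condition (a) are taken as strict exactly as printed, and "exactly two distinct instants" is read as "exactly one pair (n, m) satisfies (a) to (c)". The names `split_points` and `decode` are hypothetical.

```python
from itertools import combinations


def _all_distinct(states):
    """True when no state occurs twice in the given segment."""
    return len(set(states)) == len(states)


def split_points(instance):
    """Return the unique pair (n, m) of 1-based separators, or None."""
    l = len(instance)
    found = []
    for n, m in combinations(range(1, l + 1), 2):  # all pairs with n < m
        if not (2 < n + 1 < m < l - 1):            # condition (a), as printed
            continue
        if instance[n - 1] or instance[m - 1]:      # condition (b): both states empty
            continue
        parts = (instance[:n - 1], instance[n:m - 1], instance[m:])
        if all(_all_distinct(p) for p in parts):    # condition (c)
            found.append((n, m))
    return found[0] if len(found) == 1 else None


def decode(instance):
    """Recover the three encoded super-sets from a splitable instance."""
    points = split_points(instance)
    if points is None:
        raise ValueError("instance is not splitable")
    n, m = points
    return set(instance[:n - 1]), set(instance[n:m - 1]), set(instance[m:])
```

In this reading, partner I would see the prefix up to instant m and partner II the suffix starting at instant n, so the middle super-set is visible to both.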

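Definition 6. A boolean NTL formula $\varphi$ expresses the predicate $\mathcal{P}$ of arity 3
if there exists a permutation $\pi$ on $\{1, 2, 3\}$ such that for each tuple of
super-sets $(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3)$ we have: $\mathcal{P}(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) = true$ iff $\mathcal{I} \models_{ntl} \varphi$ where $\mathcal{I}$ encodes
$(\mathcal{X}_{\pi(1)}, \mathcal{X}_{\pi(2)}, \mathcal{X}_{\pi(3)})$. □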

In order to prove that the predicates expressible in NTL are always
1-polynomial, we need to define an intermediate language called split-NTL which
has the same expressive power as NTL over splitable temporal instances.
Syntactically, split-NTL is a version of NTL where each temporal modality has a
left and a right version. For instance, Next is replaced by Next$^{left}$ and Next$^{right}$.
Informally, the left (resp. right) version of a temporal operator behaves like the
operator itself except that it is intended to be evaluated on the left (resp. right) part of the
splitable temporal instance. Defining the semantics of split-NTL requires two
functions left and right presented below:

• for $i \in [1..m]$, left($i$) $= i$ and for $i \in [m+1..l]$, left($i$) $= m$.

• for $i \in [1..n-1]$, right($i$) $= 1$ and for $i \in [n..l]$, right($i$) $= i - n + 1$.
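
The two index maps are simple enough to write down directly. The sketch below assumes 1-based instants and assumes that the definition of right, which is cut short in the source, ends with "+ 1" so that instant n becomes position 1 of the right part; passing `m` and `n` as explicit parameters is a choice made here for illustration.

```python
def left(i: int, m: int) -> int:
    """Map a global instant to a position in the left part (I_1, ..., I_m)."""
    return i if i <= m else m


def right(i: int, n: int) -> int:
    """Map a global instant to a position in the right part, which starts at n."""
    return 1 if i <= n - 1 else i - n + 1
```

Instants that fall outside a part are clamped to the nearest end of that part: left sends everything after m to m, and right sends everything before n to position 1.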

Definition 7. Given a splitable temporal instance $\mathcal{I} = (I_1, \ldots, I_l)$ and a
formula $\varphi$ of split-NTL, the truth of $\varphi$ at time $p$ given the relative origin $t$ and
the valuation $\nu$ of the free variables, denoted $[\mathcal{I}, t, p, \nu] \models_{split} \varphi$, is defined^4 as
follows:

• Let OP be either Next or Prev or H.

If $\varphi$ is OP$^{left}\varphi_1$ then $[\mathcal{I}, t, p, \nu] \models_{split} \varphi$ if $[\mathcal{I}^{left}, left(t), left(p), t, \nu] \models_{left}$ OP$\,\varphi_1$.

If $\varphi$ is OP$^{right}\varphi_1$ then $[\mathcal{I}, t, p, \nu] \models_{split} \varphi$ if $[\mathcal{I}^{right}, right(t), right(p), t, \nu] \models_{right}$ OP$\,\varphi_1$.

• The case where OP is a binary temporal operator is treated in a similar
manner. □

The relations $\models_{left}$ and $\models_{right}$ are defined for hybrid formulas whose upper
temporal operator is unmarked but whose subformulas are in split-NTL.
Informally, these relations are defined like $\models_{ntl}$ over either the left or right part of a
splitable temporal instance. Due to space limitation, these relations are not
formally defined here. For instance, $[\mathcal{I}^{left}, i, j, t, \nu] \models_{left} \varphi_1$ Since $\varphi_2$ if there exists $u$
such that $i < u < j$ and $[\mathcal{I}, t, u, \nu] \models_{split} \varphi_2$ and $\forall k \in\, ]u, j]$, $[\mathcal{I}, t, k, \nu] \models_{split} \varphi_1$.
Note that in $[\mathcal{I}^{left}, i, j, t, \nu] \models_{left} \ldots$, $i$ is the relative origin, $j$ is the “at” instant
of evaluation with respect to the left (or right) part of $\mathcal{I}$ and $t$ stores the relative
origin with respect to the entire splitable temporal instance $\mathcal{I}$.

^4 The definition of $\models_{split}$ is only given for the temporal operators.



The next result shows that split-NTL can be used instead of NTL to query 
splitable temporal instances. 

Lemma 3. Given an NTL formula $\varphi$ there exists a formula $\psi$ of split-NTL
such that for each splitable temporal instance $\mathcal{I}$ we have: $[\mathcal{I}, \nu] \models_{ntl} \varphi$ iff
$[\mathcal{I}, \nu] \models_{split} \psi$.

The proof is technical and does not present any difficulty. It is omitted here.
This lemma is however the key to proving that:

Theorem 3. Each ternary predicate $\mathcal{P}$ expressible by a boolean NTL formula
is 1-polynomial.

Proof (Sketch): Assume that $\varphi$ is a boolean NTL formula which expresses
the predicate $\mathcal{P}$. By Lemma 3, the predicate $\mathcal{P}$ is expressible by a split-NTL
closed formula $\psi$. In order to show that $\mathcal{P}$ is 1-polynomial, we need to exhibit a
polynomial communication protocol computing $\mathcal{P}$. This protocol is derived from
the split-NTL formula $\psi$. The arity of the messages is given by the maximal
number of free variables of any subformula of $\psi$. Of course, the order of the
pivot is 1. The protocol computing $\mathcal{P}(\mathcal{X}, \mathcal{Y}, \mathcal{Z})$ is now sketched while assuming
that the triple $(\mathcal{X}, \mathcal{Y}, \mathcal{Z})$ is encoded by the splitable temporal instance $\mathcal{I}$ and
that the partner I (resp. II) controls $\mathcal{X}$ and $\mathcal{Y}$ (resp. $\mathcal{Y}$ and $\mathcal{Z}$) encoded by $\mathcal{I}^{left}$
(resp. $\mathcal{I}^{right}$). This entails that $\mathcal{Y}$ is the pivot. It is assumed that all temporal
subformulas of $\psi$ are enumerated by $\psi_1, \ldots, \psi_s = \psi$ such that each subformula
occurs after its own subformulas. The execution of the protocol can be decomposed
into $s$ steps and each step turns out to consist of $\sum_{t=1}^{m-n+1} (m - n + 1 - t + 1) = \binom{|\mathcal{Y}|+3}{2}$
exchanges of messages.
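
The count of exchanges per step can be checked by a short calculation. The derivation below is a sketch that assumes the ranges used in the proof (t from 1 to m − n + 1, and p from t to m − n + 1) and uses the fact that the m − n − 1 states strictly between the two separators are pairwise distinct, so that $|\mathcal{Y}| = m - n - 1$; the abbreviation $M$ is introduced here only for readability.

```latex
\[
\sum_{t=1}^{M} (M - t + 1) \;=\; \sum_{j=1}^{M} j \;=\; \frac{M(M+1)}{2}
\;=\; \frac{(|\mathcal{Y}|+2)(|\mathcal{Y}|+3)}{2},
\qquad \text{where } M = m - n + 1 = |\mathcal{Y}| + 2 .
\]
```

The number of exchanges in one step is therefore quadratic in the size of the pivot and independent of the sizes of the other two super-sets, which is what polynomiality in the sense of Definition 3 requires.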

Each step $i$ corresponds to “what is necessary” for the evaluation of the
split-NTL subformula $\psi_i$, assuming that all subformulas $\psi_k$, $k < i$ have been
dealt with. Let us consider for instance that the upper temporal modality of $\psi_i$
is marked by left. In this case, the partner I who controls the left part of the
instance, is going to evaluate the messages $\{\nu \mid [\mathcal{I}^{left}, t, p, 1, \nu] \models_{left} \psi_i\}$ for
$t = 1$ to $m - n + 1$ and for $p = t$ to $m - n + 1$. In order to do that, the
partner I may have to use the preceding messages computed for the subformulas
$\psi_k$, $k < i$. □

The proof of Theorem 2 stating that NTL ⊂ TS-FO follows directly from
Theorem 3, Lemma 2 and the fact that the predicate N-SEP can be expressed by
the TS-FO closed formula $\exists t\, \exists s\, \exists u\, \forall x (R(x, t) \rightarrow \neg R(x, s)) \wedge \forall x (R(x, u) \leftrightarrow
(R(x, t) \vee R(x, s)))$.

4 The Expressive Power of RNTL 

The results of the previous section show that NTL is not complete. This
motivates the investigation of a new extension of TL. The language RNTL extends
NTL by the introduction of a modality $\Re$ allowing one to restore the most
recently erased segment of the past. The syntax of RNTL is defined by adding the
following formation rule to the ones defining NTL: if $\varphi$ is a formula then $\Re\varphi$ is
a formula.

The ability to restore segments of the past requires that relative origins be
memorized. These instants can be considered as checkpoints. Roughly speaking,
RNTL adds to TL a stack of relative origins, the semantics of H formalizing push
and the semantics of $\Re$ formalizing pop.

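Definition 8. Given a temporal instance $\mathcal{I} = (I_1, \ldots, I_l)$ and an RNTL formula $\varphi$,
the truth of $\varphi$ at time $p$, given the sequence of checkpoints $(t_1, \ldots, t_k)$ and the
valuation $\nu$, denoted $[\mathcal{I}, (t_1, \ldots, t_k), p, \nu] \models_{rntl} \varphi$, is defined by:

• If $\varphi$ is obtained by one of the formation rules defining TL then
$[\mathcal{I}, (t_1, \ldots, t_k), p, \nu] \models_{rntl} \varphi$ is defined in the same way as $[\mathcal{I}|_{t_k}, p - t_k + 1, \nu] \models_{tl} \varphi$.

• If $\varphi$ is H$\varphi_1$ then $[\mathcal{I}, (t_1, \ldots, t_k), p, \nu] \models_{rntl} \varphi$ if $[\mathcal{I}, (t_1, \ldots, t_k, p), p, \nu] \models_{rntl} \varphi_1$.

• If $\varphi$ is $\Re\varphi_1$ then $[\mathcal{I}, (t_1, \ldots, t_k), p, \nu] \models_{rntl} \varphi$ if $[\mathcal{I}, (t_1, \ldots, t_{k-1}), p, \nu] \models_{rntl} \varphi_1$
when $k > 1$ and $[\mathcal{I}, (t_1), p, \nu] \models_{rntl} \varphi_1$, otherwise. □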

Above, it is assumed that the list of checkpoints is nonempty (it always
contains the absolute origin) and increasing. Multiple occurrences of an instant
are allowed in this list. Note also that the last checkpoint $t_k$ is the “active” relative
origin and obviously, it is assumed that the “at” instant $p$ satisfies $t_k \le p \le l$.
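
The push and pop behaviour of the checkpoint sequence in Definition 8 can be sketched in Python. This is an illustrative model only: checkpoints are kept in a tuple, and the function names `push`, `restore` and `active_origin` are hypothetical labels for the effect of the push modality (printed as H above), of the restore modality, and of reading the last checkpoint.

```python
from typing import Tuple

Checkpoints = Tuple[int, ...]  # nonempty; first entry is the absolute origin


def push(checkpoints: Checkpoints, p: int) -> Checkpoints:
    """Push step: the current instant p becomes the active relative origin."""
    if not checkpoints or p < checkpoints[-1]:
        raise ValueError("p must not precede the active relative origin")
    return checkpoints + (p,)


def restore(checkpoints: Checkpoints) -> Checkpoints:
    """Pop step: drop the active origin, unless only the absolute origin is left."""
    return checkpoints[:-1] if len(checkpoints) > 1 else checkpoints


def active_origin(checkpoints: Checkpoints) -> int:
    """The last checkpoint is the origin used to evaluate TL subformulas."""
    return checkpoints[-1]
```

Starting from `(1,)`, pushing at instants 3 and 5 gives `(1, 3, 5)`; one restore returns to `(1, 3)`, and restoring on `(1,)` leaves the sequence unchanged, which mirrors the "otherwise" case of the definition.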

NTL versus RNTL and TS-FO Although in the propositional case, the
operator $\Re$ adds no expressive power to NTL [4], it is relatively simple to show that:

Theorem 4. NTL ⊂ RNTL ⊆ TS-FO.

In fact the predicate N-SEP which was used to separate NTL and TS-FO
can be expressed by the following RNTL closed formula (pfR(x) denotes the
subformula P(first ∧ R(x))):

F H F H F($\forall x[\Re\, pfR(x) \rightarrow \neg pfR(x)] \wedge \forall x[(\Re\, pfR(x) \vee pfR(x)) \leftrightarrow R(x)]$).
The proof that RNTL ⊆ TS-FO is technical but rather easy.

A hierarchy for RNTL^ The completeness of RNTL remains a conjecture. The 
hierarchy of RNTL's subclasses presented next is an encouraging and important step.
The fragment RNTL$^i$ is the subset of RNTL formulas whose $\Re$-depth is less than or
equal to $i$. The $\Re$-depth of a formula is determined from its tree representation
and the maximal number of modalities $\Re$ over branches of this tree. For
instance the formula expressing N-SEP is in RNTL$^1$. Note also that NTL matches
RNTL$^0$.
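
The depth measure can be computed by a straightforward recursion on the formula tree. The sketch below assumes a hypothetical encoding of formulas as nested tuples whose first component is an operator tag, with the tag `"R"` standing for the restore modality; the tags themselves are not taken from the paper.

```python
def restore_depth(formula) -> int:
    """Maximal number of restore modalities ("R" nodes) along any branch."""
    op, *args = formula
    # Only tuple arguments are subformulas; strings are atom or variable names.
    deepest = max((restore_depth(a) for a in args if isinstance(a, tuple)), default=0)
    return deepest + 1 if op == "R" else deepest


# Hypothetical encoding of the shape of the N-SEP formula: each restore
# modality occurs directly above an atom, under three F and two push steps.
pfR = ("atom", "pfR(x)")
body = ("and",
        ("forall", "x", ("implies", ("R", pfR), ("not", pfR))),
        ("forall", "x", ("iff", ("or", ("R", pfR), pfR), ("atom", "R(x)"))))
nsep = ("F", ("H", ("F", ("H", ("F", body)))))
```

Under this encoding `restore_depth(nsep)` evaluates to 1, in line with the remark that the formula expressing N-SEP lies in the first level of the hierarchy, while a formula without any restore modality has depth 0.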

Theorem 5.

• for $i \ge 1$, RNTL$^{i-1}$ ⊂ RNTL$^i$

• for $i \ge 0$, RNTL$^i$ ⊂ TS-FO

The proof is similar to the one of Theorem 2. Of course, it makes use of the
general communication protocols introduced in Section 3 for proving:






Lemma 4. (1) Each predicate of arity $i + 2$ (over super-sets) expressible by a
boolean RNTL$^{i-1}$ formula is $i$-polynomial.

(2) There exists a predicate SEP-$i$ of arity $i + 2$ which is not $i$-polynomial.

The predicate SEP-$i$ when $i \ge 1$ is specified by: SEP-$i$($\mathcal{X}_1, \ldots, \mathcal{X}_{i+2}$) = true iff
$\exists X_1 \in \mathcal{X}_1, \ldots, \exists X_{i+2} \in \mathcal{X}_{i+2}$ such that $X_j \cap X_k = \emptyset$ for $j, k \in [1, i+1]$ ($j \neq k$) and
$X_1 \cup \ldots \cup X_{i+1} = X_{i+2}$. For example, SEP-1 is the predicate N-SEP introduced
in Section 3 to separate NTL and TS-FO.
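
The specification of SEP-i translates directly into a brute-force check over all choices of one set per super-set. This is a sketch for illustration, exponential in general and unrelated to the communication protocols of the paper; the function name `sep` and the encoding of super-sets as Python sets of frozensets are assumptions.

```python
from itertools import product


def sep(super_sets) -> bool:
    """SEP-i for i = len(super_sets) - 2, assuming i >= 1.

    True iff pairwise disjoint sets can be chosen from all but the last
    super-set such that their union belongs to the last super-set.
    """
    *parts, target = super_sets
    for choice in product(*parts):
        pairwise_disjoint = all(
            a.isdisjoint(b)
            for k, a in enumerate(choice)
            for b in choice[k + 1:]
        )
        if pairwise_disjoint and frozenset().union(*choice) in target:
            return True
    return False
```

With three super-sets the function decides N-SEP, for example on `[{frozenset("a")}, {frozenset("b")}, {frozenset("ab")}]`.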

The binary predicate SEP-0 is the predicate TL-SEP introduced in Section 3
and defined by SEP-0($\mathcal{X}_1, \mathcal{X}_2$) = true iff $\exists X_1 \in \mathcal{X}_1, \exists X_2 \in \mathcal{X}_2$ such that $X_1 = X_2$.
It is one of the predicates introduced in [2] in order to separate TL and TS-FO.

The predicate SEP-$i$ of arity $i + 2$ is not $i$-polynomial thus not expressible in
RNTL$^{i-1}$. However it can be expressed in RNTL$^i$ and as a matter of fact this
predicate (extended in a trivial way to make it of arity $i + 3$) is $(i + 1)$-polynomial.
Recall that:

SEP-0 = TL-SEP is not expressible in TL because it is not 0-polynomial but
it is expressible in RNTL$^0$ = NTL. A trivial protocol with pivot $\mathcal{X}_2$ (thus of order
1) which assigns both super-sets $\mathcal{X}_1$ and $\mathcal{X}_2$ to partner I and $\mathcal{X}_2$ together with
the empty super-set to partner II computes SEP-0 = TL-SEP. This protocol has
a constant complexity. This shows that the predicate SEP-0 is 1-polynomial.

SEP-1 = N-SEP is not expressible in NTL because it is not 1-polynomial but
it is expressible in RNTL$^1$. It can be shown that this predicate is 2-polynomial.

Relationship with TS-FO$^i$ We conclude this section with some comments on
highly interesting links between (1) the strict hierarchy in expressive power among
the fragments RNTL$^i$ established above, and (2) the conjecture that there is
a strict hierarchy in expressive power among the fragments TS-FO$^i$. The
language TS-FO$^i$ is the subclass of TS-FO formed by formulas built using at
most $i$ distinct time-variables. It is known that TS-FO$^1$ ⊂ TS-FO$^2$ ⊂ TS-FO$^3$.
However, for $i \ge 3$ the strict inclusion TS-FO$^i$ ⊂ TS-FO$^{i+1}$ remains a conjecture.

It has already been established, and it is not difficult to verify, that each TL
formula is expressible by a formula in TS-FO using at most 3 distinct
time-variables. The proof of the separation of TL and TS-FO [2] entails that TL is
strictly less expressive than TS-FO$^3$ because the predicate TL-SEP = SEP-0 can
be expressed by a formula with 2 distinct (thus at most 3) time-variables. This
generalizes when considering RNTL$^i$:

Lemma 5. RNTL$^i$ ⊆ TS-FO$^{i+4}$

Note that an NTL formula can be translated into a TS-FO$^4$ formula. The
proof of separating RNTL$^i$ and RNTL$^{i+1}$ entails that RNTL$^i$ is strictly less
expressive than TS-FO$^{i+4}$: the predicate SEP-$(i+1)$ can be expressed by a formula
with $i+2$ (thus at most $i+4$) distinct time-variables. Note that one of the possible
TS-FO formulas translating SEP-$(i+1)$ is $\exists t_1 \ldots \exists t_{i+2} [(\forall x R(x, t_1) \rightarrow \neg R(x, t_2)) \wedge \ldots \wedge
(\forall x R(x, t_{i+1}) \rightarrow \neg R(x, t_{i+2})) \wedge (\forall x R(x, t_1) \vee \ldots \vee R(x, t_{i+1}) \leftrightarrow R(x, t_{i+2}))]$.






The main contribution of these results to the conjecture^5 that there exists a
strict TS-FO$^i$ hierarchy is that proving this conjecture reduces to proving that
TS-FO$^i$ ⊆ RNTL$^k$ for some $k > i - 3$. We are currently investigating techniques
in order to prove this result. Completeness of RNTL would then directly follow.

Note that the results in [13] do not contradict this conjecture. [13] proves
that, for dense linear order (generalizable to discrete linear order), extending
TL by a finite number of temporal connectives always leads to languages less
expressive than TS-FO (their temporal connectives are formulas in TS-FO with
a fixed finite number of free time variables). NTL and each fragment RNTL$^i$ fall
into this class of TL-extensions. However, we claim that it is not the case for
RNTL.

Acknowledgements 

We thank S. Abiteboul, J-M. Couvreur and D. Niwinski for informal helpful 
discussions. 



References 

1. Abiteboul, S., Herr, L. and Van den Bussche, J.: Temporal Connectives Versus
Explicit Timestamps in Temporal Query Languages, In Recent Advances in Temporal
Databases, S. Clifford and A. Tuzhilin, Eds, Springer Verlag (1995) 43-60

2. Abiteboul, S., Herr, L. and Van den Bussche, J.: Temporal Versus First-Order
Logic to Query Temporal Databases, Proceedings of PODS'96, (1996) 49-57

3. Abiteboul, S., Hull, R. and Vianu, V.: Foundations of Databases, Addison-Wesley
(1995)

4. Bidoit, N. and De Amo, S.: Branching time temporal logic for querying multiversion
databases, Technical Report in preparation (1999)

5. Chomicki, J.: Temporal Query Languages: a survey. Temporal Logic, First Int.
conf., LNAI 827 (1994) 506-534

6. Chomicki, J. and Toman, D.: Temporal Logic in Information Systems, Logics for
Databases and Information Systems (1998) 31-70

7. Clifford, J., Croker, A. and Tuzhilin, A.: On Completeness of Historical Relational
Query Languages, ACM Transactions on Database Systems, 19, 1 (1994) 64-116

8. Emerson, E. A.: Temporal and Modal Logic, In Handbook of Theoretical Computer
Science, Volume B: Formal Models and Semantics, Jan van Leeuwen, Ed., Elsevier
Science Publishers (1990) 995-1072

9. Gabbay, D. M., Pnueli, A., Shelah, S., and Stavi, J.: On the Temporal Basis of
Fairness, Symposium on Principles of Programming Languages, (1980) 163-173

10. Hafer, T., and Thomas, W.: Computation Tree Logic CTL* and Path Quantifiers
in the Monadic Theory of the Binary Tree, In Automata, Languages and
Programming, 14th International Colloquium, LNCS 267 (1987) 269-279

^5 A closely related question is whether there is a strict FO$^i$ hierarchy on the class of
ordered finite graphs.






11. Kamp, H. W.: Tense Logic and the Theory of Linear Order, PhD thesis, University
of California, Los Angeles (1968)

12. Laroussinie, F., and Schnoebelen, Ph.: A hierarchy of temporal logics with past,
TCS, 148, 2, (1995) 303-324

13. Toman, D., and Niwinski, D.: First Order Queries over temporal Databases
Inexpressible in Temporal Logic, EDBT (1996) 307-324



On the Undecidability of Some 
Sub-classical First-Order Logics* 



Matthias Baaz^1, Agata Ciabattoni^2, Christian Fermüller^1, and Helmut Veith^1

^1 Technische Universität Wien, Austria
^2 Università di Milano, Italy



Abstract. A general criterion for the undecidability of sub-classical
first-order logics and important fragments thereof is established. It is applied,
among others, to Urquhart's (original version of) C and the closely related
logic C*. In addition, hypersequent systems for (first-order) C and C*
are introduced and shown to enjoy cut-elimination.



1 Introduction 

A wide range of non-classical logics can be viewed as sub-classical. By this we 
mean that they are based on (a subset of) the signature of classical logic and 
can be extended to classical logic. Intuitionistic logic, fragments of linear logic, 
Łukasiewicz logic, Gödel/Dummett logic (in fact, all intermediate logics, and
most substructural or many-valued logics), are just a few prominent examples.

(Un)decidability issues are a fundamental topic of Logic in the context of 
Theoretical Computer Science. In the literature there are results on the decision 
problem for particular first-order sub-classical logics. For instance, in [14,11,10] 
the decision problem for some substructural logics, that is logics that when 
formulated as a Gentzen system do not have all the structural rules, has been 
investigated. It was proved that removing the weakening rule from standard 
Gentzen-style formulations of classical or intuitionistic first-order logic gives a 
logic that is still undecidable. On the other hand, removing the contraction rule 
from a sequent calculus for classical or intuitionistic logic results in a decidable 
logic. However it is clear that the lack of contraction in itself is not sufficient for 
the decidability of a sub-classical logic, as witnessed, e.g., by Łukasiewicz's and
other important “fuzzy logics” (see [8]).

In this paper we provide a sufficient and quite general criterion for the
undecidability of first-order sub-classical logics, with or without function symbols
(in fact: already fragments thereof). Moreover, we continue the proof theoretical
investigation, initiated in [3], of the logics C (introduced by A. Urquhart
in [16]^1, see also [13]), and the related C*. These underlie the most
important formalizations of fuzzy logic, namely Gödel, Łukasiewicz and Product logic

* Partly supported by COST-Action No. 15, FWF grant P-12652-MAT and WTZ 
Austria-Italy 1999-2000, Project No. 4 

^1 Recently A. Urquhart discovered that his axiom system for C as given in [16] is
incomplete with respect to an intended semantics [17]. Because of its proof theoretic
interest we, like others (e.g. [13]), will deal with the original system C here.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 258—268, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






(see, e.g., [8]). C and C* lack contraction, i.e., $[A \supset (A \supset B)] \supset (A \supset B)$ does
not hold. Moreover, they share the property that their truth values are linearly
ordered, that is they satisfy the linearity axiom $(A \supset B) \vee (B \supset A)$.
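
Both properties can be observed numerically in one many-valued semantics. The sketch below uses the standard Łukasiewicz truth functions on [0, 1], with implication min(1, 1 − x + y) and disjunction max(x, y), as an assumed test bed; Łukasiewicz logic is an extension of C*, so a failure of contraction there shows only that contraction is not derivable in the weaker systems, and the grid check of linearity is an illustration rather than a proof.

```python
from fractions import Fraction


def imp(a: Fraction, b: Fraction) -> Fraction:
    """Łukasiewicz implication: min(1, 1 - a + b)."""
    return min(Fraction(1), 1 - a + b)


def disj(a: Fraction, b: Fraction) -> Fraction:
    """Lattice disjunction: max(a, b)."""
    return max(a, b)


# Exact rational truth values 0, 1/10, ..., 1 avoid floating-point noise.
grid = [Fraction(k, 10) for k in range(11)]

# Linearity (A ⊃ B) ∨ (B ⊃ A) takes value 1 at every grid point.
linearity_holds = all(disj(imp(a, b), imp(b, a)) == 1 for a in grid for b in grid)

# Contraction [A ⊃ (A ⊃ B)] ⊃ (A ⊃ B) drops below 1 at some grid points.
contraction_counterexamples = [
    (a, b) for a in grid for b in grid
    if imp(imp(a, imp(a, b)), imp(a, b)) != 1
]
```

For instance, with A = 1/2 and B = 0 the inner implication A ⊃ B has value 1/2, A ⊃ (A ⊃ B) has value 1, and the whole contraction formula evaluates to 1/2.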

In [3] we introduced propositional cut-free hypersequent calculi for C and
C*. This allowed us to establish that derivability in these logics is decidable. In
this paper we consider the first-order version of these logics and define analytic
(cut-free) calculi for them. The analyticity of the logics C and C* is contrasted
with their undecidability, which is a corollary to the general undecidability result
mentioned above. It should also be seen in light of the fact that their purely
intuitionistic counterparts $I^-$ and $I^{-*}$, obtained by dropping the linearity axiom
from C and C*, respectively, are decidable.

2 Basic Definitions 

We investigate logics that are based on the language of classical first-order logic
without equality. More precisely, we use the binary connectives $\vee$, $\wedge$, $\supset$ and the
truth constant $\bot$. Negation is defined, as usual, by: $\neg A := (A \supset \bot)$.

Remark 1. Treating negation as a defined connective is just a matter of conve- 
nience. Explicit axioms and rules for negation could easily be added in all our 
cases. However — as witnessed by intuitionistic logic — the other connectives and 
the quantifiers will not be interdefinable in general. 

Object variables are denoted by $x, y, \ldots$. The quantifiers $\forall$ (for all) and $\exists$ (exists)
refer to these variables. Moreover, there is an infinite supply of n-ary predicate
symbols for every $n \ge 0$. We consider both logics with and logics without function
symbols. In the former case an infinite supply of n-ary function symbols and
constants is assumed for every $n \ge 0$. Constants are considered as 0-ary function
symbols. Terms and Formulas are inductively defined in the usual way. Classical
first-order logic with function symbols is denoted as $CL_f$, without function
symbols as $CL$.

Here, we assume logics to be specified by Hilbert-style systems. Axioms are
always considered as schemata. (I.e., all instances are axioms, too.) Rules of
derivation are written in the form

$\vdash A_1, \ldots, \vdash A_n \Longrightarrow \vdash B$.

A logic $\mathcal{L}$ is identified with the set of provable formulas. We write $\vdash_{\mathcal{L}} F$ for $F \in \mathcal{L}$.

Definition 1. A logic $\mathcal{L}$ (with or without function symbols) is called sub-classical
if it can be extended to classical logic. More formally: if $\vdash_{\mathcal{L}} F$ implies $\vdash_{CL_f} F$.

Definition 2. A rule $\vdash A_1, \ldots, \vdash A_n \Longrightarrow \vdash B$ is admissible in logic $\mathcal{L}$ if $\vdash_{\mathcal{L}} B'$
whenever $\vdash_{\mathcal{L}} A'_i$ for all $1 \le i \le n$, where $B'$ and $A'_i$ are corresponding instances
of the schematic formulas of the rule.

We write $\alpha[F/p]$ for the formula that arises by instantiating the formula
variable $p$ of the schema $\alpha[p]$ with $F$.






3 Two General Undecidability Results 

We establish sufficient conditions that allow us to embed (undecidable fragments
of) classical logic into sub-classical logics. As a central notion for the “simulation
of two-valued-ness” in non-classical contexts we introduce the following:

Definition 3. A schema of form $\alpha[p] \vee \alpha'[p]$ with exactly one formula variable $p$
is called a duality principle of a logic $\mathcal{L}$ if for all formulas $F$: $\vdash_{\mathcal{L}} \alpha[F/p] \vee \alpha'[F/p]$
but there exists a formula $G$ such that $\nvdash_{\mathcal{L}} \alpha[G/p]$ and $\nvdash_{\mathcal{L}} \alpha'[G/p]$.

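For the undecidability proof below we will assume admissibility^2 of the following
rules for disjunction:

$\vdash A \Longrightarrow \vdash A \vee B$ (weak)
$\vdash A \vee (B \vee C) \Longrightarrow \vdash (A \vee B) \vee C$ (assoc-l)
$\vdash A \vee B \Longrightarrow \vdash B \vee A$ (comm)
$\vdash (A \vee B) \vee C \Longrightarrow \vdash A \vee (B \vee C)$ (assoc-r)
$\vdash A \vee A \Longrightarrow \vdash A$ (idem)

The following rule expresses a weak form of distributivity:

$\vdash A \vee C,\ \vdash B \vee C \Longrightarrow \vdash (A \wedge B) \vee C$ (distr)

A “minimal” rule for existential quantification is:

$\vdash A(t) \Longrightarrow \vdash \exists x\colon A(x)$ (∃-in')

where $t$ is any term.

Definition 4. We call a rule $\vdash A_1, \ldots, \vdash A_n \Longrightarrow \vdash B$ ∨-normal in a logic $\mathcal{L}$ if
its admissibility in $\mathcal{L}$ implies the admissibility of $\vdash A_1 \vee C, \ldots, \vdash A_n \vee C \Longrightarrow \vdash B \vee C$.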



Theorem 1. Any sub-classical logic $\mathcal{L}$ with function symbols that has a duality
principle and in which the rules weak, assoc-l, assoc-r, comm, idem, distr, ∃-in'
are admissible and ∨-normal is undecidable.

Proof. The proof proceeds as follows: we consider any classical formula $F$ in
conjunctive normal form and apply Herbrand's theorem. By using the duality
principle we can translate the resulting Herbrand instance $F_{Her}$ into a formula
$\delta(F_{Her})$ such that $\vdash_{CL_f} F_{Her}$ if and only if $\vdash_{\mathcal{L}} \delta(F_{Her})$. This relation transfers to
$F$ and $\delta(F)$. Hence it follows that $\mathcal{L}$ is undecidable.

It is well known that validity (provability) of formulas of form

$F = \exists \bar{x}\colon \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} L_{i,j}(\bar{x})$

where the $L_{i,j}$ are literals and $\bar{x}$ is a vector of variables, is undecidable in
classical logic with function symbols.

^2 Remember that admissibility of rules is a much weaker condition than the
derivability of corresponding formulas. E.g., the admissibility of idem in $\mathcal{L}$, in general,
does not imply $\vdash_{\mathcal{L}} (A \vee A) \supset A$.






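Assume that $\vdash_{CL_f} F$. By Herbrand's theorem there exist vectors of ground
terms $\bar{t}_1, \ldots, \bar{t}_k$ such that also the Herbrand instance

$F_{Her} = \bigvee_{1 \le \ell \le k} \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} L_{i,j}(\bar{t}_\ell)$

is provable.

A disjunction $\bigvee_{1 \le i \le (km)} L_i$ is called a path of $F_{Her}$ if each of the literals $L_i$
occurs in exactly one of the “clauses” $\bigwedge_{1 \le j \le m_i} L_{i,j}(\bar{t}_\ell)$ of $F_{Her}$. A pair of
literals of the form $(P(t), \neg P(t))$ is called complementary. It is easy to check that
$\vdash_{CL_f} F_{Her}$ iff every path $\pi$ of $F_{Her}$ contains a complementary pair $(P(t), \neg P(t))$,
i.e. if both $P(t)$ and $\neg P(t)$ occur as disjuncts in $\pi$.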
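
The path criterion is easy to state as a program for ground instances. The sketch below assumes a hypothetical encoding in which a Herbrand instance is a list of clauses (conjunctions), each clause a list of literals, and a literal is a pair of a ground atom name and a sign; it enumerates all paths explicitly and is therefore exponential in the number of clauses.

```python
from itertools import product


def has_complementary_pair(path) -> bool:
    """True if some atom occurs both positively and negatively in the path."""
    literals = set(path)
    return any((atom, not positive) in literals for atom, positive in literals)


def herbrand_instance_valid(clauses) -> bool:
    """Check the criterion: every path contains a complementary pair.

    A path picks exactly one literal from each clause of the disjunction.
    """
    return all(has_complementary_pair(path) for path in product(*clauses))
```

For example, the instance with the two one-literal clauses P(a) and ¬P(a) passes the test, whereas (P(a) ∧ Q(a)) ∨ ¬P(a) fails because the path through Q(a) and ¬P(a) has no complementary pair.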

Let $\alpha[p] \vee \alpha'[p]$ be a duality principle of $\mathcal{L}$. Consider the following
translation $\delta(\cdot)$ of literals:

$\delta(P) = \alpha[P/p]$

$\delta(\neg P) = \alpha'[P/p]$

The translation extends homomorphically to arbitrary formulas.
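
The homomorphic extension of the literal translation can be sketched as a recursive function. Everything about the encoding is an assumption made for illustration: formulas are nested tuples with the tags `"lit"`, `"and"`, `"or"` and `"exists"`, and the two halves of the duality principle are passed in as callables `alpha` and `alpha_prime` that build the instantiated schema for a given atom.

```python
def delta(formula, alpha, alpha_prime):
    """Translate literals via the duality principle and recurse elsewhere."""
    tag = formula[0]
    if tag == "lit":
        _, atom, positive = formula
        # A positive literal P uses the first schema, a negated one the second.
        return alpha(atom) if positive else alpha_prime(atom)
    if tag in ("and", "or"):
        return (tag,
                delta(formula[1], alpha, alpha_prime),
                delta(formula[2], alpha, alpha_prime))
    if tag == "exists":
        return ("exists", formula[1], delta(formula[2], alpha, alpha_prime))
    raise ValueError(f"unsupported formula tag: {tag!r}")
```

Applied to the disjunction of a literal and its negation, the translation yields exactly an instance of the duality principle, which is the step used on complementary pairs in the proof.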

We first show that the classical provability of $F_{Her}$ implies that $\vdash_{\mathcal{L}} \delta(F_{Her})$.
Let $(P(t), \neg P(t))$ be the complementary pair of a path $\pi$ of $F_{Her}$. By definition
we have $\delta(P(t) \vee \neg P(t)) = \delta(P(t)) \vee \delta(\neg P(t)) = \alpha[P(t)/p] \vee \alpha'[P(t)/p]$, and
therefore $\vdash_{\mathcal{L}} \delta(P(t) \vee \neg P(t))$. Using rule weak we can add all the other literals of
$\pi$ disjunctively to $\delta(P(t) \vee \neg P(t))$ in $\mathcal{L}$. By the admissibility of the rules assoc-l,
assoc-r, comm and the fact that they are ∨-normal in $\mathcal{L}$ we can reorder the
disjuncts as needed. After having derived (in $\mathcal{L}$) the formulas corresponding to
every path of $F_{Her}$, we join them conjunctively using rule distr. We thus have

$\vdash_{\mathcal{L}} \bigvee_{1 \le \ell \le k} \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} \delta(L_{i,j}(\bar{t}_\ell))$.

Now since rule ∃-in' is admissible and ∨-normal we can re-introduce the
existential quantifiers in front of the inner disjunction. I.e. we obtain

$\vdash_{\mathcal{L}} \bigvee_{1 \le \ell \le k} \exists \bar{x}\colon \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} \delta(L_{i,j}(\bar{x}))$.

Observe that we used the same variables as in the original formula $F$. Therefore
all $k$ disjuncts are identical. Since rule idem is admissible and ∨-normal we can
contract this to

$\vdash_{\mathcal{L}} \delta(F)$.

We thus have proved: $\vdash_{CL_f} F$ implies $\vdash_{\mathcal{L}} \delta(F)$.

For the other direction, observe that $\vdash_{\mathcal{L}} \delta(F)$ implies $\vdash_{CL_f} \delta(F)$ since $\mathcal{L}$ is
sub-classical. It remains to check that $\delta(F)$ is equivalent to $F$ in classical logic
with respect to provability. For this observe that, since $p$ is the only formula
variable of the duality principle, $\alpha[p]$ is classically equivalent either to $p$ or to
$\neg p$. (It cannot be constantly equivalent to $\top$ by $\nvdash_{CL} \alpha[G/p]$, for some $G$. It
also cannot be constantly equivalent to $\bot$ since all instances of $\alpha[p] \vee \alpha'[p]$ are
provable in $\mathcal{L}$ and therefore also in classical logic.) The same holds for $\alpha'[p]$. Since
all instances of $\alpha[p] \vee \alpha'[p]$ are provable in $\mathcal{L}$ and therefore also in classical logic,
we conclude that

either: $\alpha[p] \equiv p$ and $\alpha'[p] \equiv \neg p$,
or: $\alpha[p] \equiv \neg p$ and $\alpha'[p] \equiv p$,

where $\equiv$ denotes classical equivalence. Both ways the duality principle is
classically equivalent to $p \vee \neg p$. Summarizing, $\vdash_{\mathcal{L}} \delta(F)$ implies $\vdash_{CL_f} F$, qed.

From the proof of Theorem 1 we immediately obtain: 

Corollary 1. The existential fragments of logics fulfilling the conditions stated
in the previous theorem are undecidable.

To obtain a useful criterion for undecidability of logics without function symbols 
we need to consider also the universal quantifier. Consider the standard rule for 
universal quantification: 

$\vdash A(x) \Longrightarrow \vdash \forall x\colon A(x)$ (∀-in')

Remark 2. We require that, if rule ∀-in' is ∨-normal, the eigenvariable condition
is obeyed in its application. I.e., in

$\vdash A(x) \vee C \Longrightarrow \vdash (\forall x\colon A(x)) \vee C$ (∀-in')

$x$ must not occur in $C$.

If, in addition to the assumptions of Theorem 1 we require ∀-in' to be
admissible and ∨-normal (including the eigenvariable condition) we obtain undecidability
for logics without function symbols.

Theorem 2. Any sub-classical logic $\mathcal{L}$ that has a duality principle and in which
the rules weak, assoc-l, assoc-r, comm, idem, distr, ∃-in', as well as ∀-in' are
admissible and ∨-normal is undecidable, even without function symbols.

Proof. The proof is analogous to that of Theorem 1. We start with formulas in
prenex conjunctive normal form:

$F = Q x_1 \ldots Q x_n\colon \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} L_{i,j}(\bar{x})$,
where $Q \in \{\exists, \forall\}$ and $\bar{x} = x_1, \ldots, x_n$.

Since we have no function symbols we apply Herbrand's theorem in its original
form (i.e., without Skolemizing). In the Herbrand instance

$F_{Her} = \bigvee_{1 \le \ell \le k} \bigvee_{1 \le i \le m} \bigwedge_{1 \le j \le m_i} L_{i,j}(\bar{y}^{(\ell)})$

the variables $x_i$ are replaced by variables $y_i^{(\ell)}$ in such a way that introducing
the quantifiers in a proof of $F$ given $F_{Her}$ can be done without violating the
eigenvariable condition. Again the proof can be transferred to the logic $\mathcal{L}$ using
the duality principle and the fact that the stated rules are admissible and
∨-normal. (Note that it is essential that ∀-in' is applied in its ∨-normal variant,
obeying the eigenvariable condition.) The rest of the proof proceeds exactly as
in Theorem 1. I.e., we have $\vdash_{CL} F$ if and only if $\vdash_{\mathcal{L}} \delta(F)$.






4 First-Order Extensions of Urquhart’s (Original) C 

In [3] we investigated some propositional logics underlying the most important 
formalizations of fuzzy logic [8]. Here we focus on the extensions of
these logics to the first-order level.

Our logics are defined by their Hilbert-style axiomatizations. We refer to the 
following list of axioms (cf. [3]): 



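Ax1: $A \supset (B \supset A)$
Ax2: $(A \supset B) \supset [(C \supset A) \supset (C \supset B)]$
Ax3: $[A \supset (C \supset B)] \supset [C \supset (A \supset B)]$
Ax4: $(A \wedge B) \supset A$
Ax5: $(A \wedge B) \supset B$
Ax6: $A \supset [C \supset (A \wedge C)]$
Ax7: $A \supset (A \vee B)$
Ax8: $B \supset (A \vee B)$
Ax9: $(A \supset B) \supset [(C \supset B) \supset [(A \vee C) \supset B]]$

Res1: $[(A \wedge B) \supset C] \supset [A \supset (B \supset C)]$
Res2: $[A \supset (B \supset C)] \supset [(A \wedge B) \supset C]$
Lin: $(A \supset B) \vee (B \supset A)$
Com: $[A \wedge (A \supset B)] \supset [B \wedge (B \supset A)]$
Abs: $\bot \supset A$
Contr: $[A \supset (A \supset B)] \supset (A \supset B)$
Dneg: $\neg\neg A \supset A$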



Taking Modus Ponens as the only rule of derivation we define the propositional
fragments of the logics $I^-$, $I^{-*}$, C, and C* by the following sets of axioms^3:

$I^-$: {Ax1, ..., Ax9, Abs}          C: $I^-$ ∪ {Lin}

$I^{-*}$: $I^-$ ∪ {Res1, Res2}       C*: $I^{-*}$ ∪ {Lin}
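
The way the four axiom systems build on one another can be written out as plain set operations. The snippet below is only a bookkeeping sketch: axioms are represented by their labels as strings, and the Python names `I_MINUS`, `I_MINUS_STAR`, `C` and `C_STAR` are hypothetical identifiers for the four logics.

```python
# Axiom labels only; the formulas themselves are listed in the text above.
I_MINUS = frozenset({f"Ax{k}" for k in range(1, 10)} | {"Abs"})
C = I_MINUS | {"Lin"}
I_MINUS_STAR = I_MINUS | {"Res1", "Res2"}
C_STAR = I_MINUS_STAR | {"Lin"}


def is_extension(stronger: frozenset, weaker: frozenset) -> bool:
    """A system extends another if it contains all of its axioms."""
    return weaker <= stronger
```

In this representation C* is at the same time the extension of C by the two residuation axioms and the extension of the starred intuitionistic system by linearity, so both `is_extension(C_STAR, C)` and `is_extension(C_STAR, I_MINUS_STAR)` hold.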

The first-order logics are given by adding to the corresponding propositional
systems the axioms:

Ax∀: $\forall x\colon A(x) \supset A(t)$          Ax∃: $A(t) \supset \exists x\colon A(x)$

and the rules

$\vdash B \supset A(x) \Longrightarrow \vdash B \supset \forall x\colon A(x)$ (∀-in)          $\vdash A(x) \supset B \Longrightarrow \vdash \exists x\colon A(x) \supset B$ (∃-in)

The quantifier axioms and rules are subject to the usual variable conditions: $t$
is free for $x$ and $x$ does not occur free in $B$.

C coincides with Urquhart's “basic” many-valued logic as defined in [16] (see
also [13]). $I^{-*}$ extends to intuitionistic logic I by adding Contr (contraction). By
adding to $I^{-*}$ the axiom Dneg (involutivity of negation) we obtain the affine
multiplicative fragment of linear logic [6] aMLL (extended with the additive
disjunction). Hájek's Basic Logic BL [8] corresponds to the logic obtained by
adding axiom Com to C* (as proved in [3]). If we add to BL axiom Dneg we get
Łukasiewicz logic [12]; while by extending BL with the axioms $\neg\neg A \supset ((B \wedge A \supset
C \wedge A) \supset (B \supset C))$ and $A \wedge \neg A \supset \bot$ we obtain Product logic [9]. C, C*, and
BL all turn into Gödel logic $G_\infty$ [7,5] if we add axiom Contr.

In [3,1] analytic (i.e., cut-free Gentzen-style) calculi for the propositional
versions of the logics $I^-$, $I^{-*}$, C, C*, and $G_\infty$ have been introduced. These calculi
are called $LJ^-$, $LJ^{-*}$, HC, HC* and GLC, respectively. As a consequence one
obtains that the derivability problem for these logics is decidable and is at most
in PSPACE. HC, HC* and GLC are based on hypersequents, a simple and
natural generalization of Gentzen sequents to multisets of sequents (see [2] for
an overview).

^3 In fact, Res1 is redundant in presence of Ax1, Ax2, Ax3.

The axioms and rules of the calculus HC are as follows: 

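Axioms:

$$A \vdash A \qquad \bot \vdash A$$

Cut Rule:

$$\frac{G \mid \Gamma_1 \vdash A \qquad G' \mid A, \Gamma_2 \vdash B}{G \mid G' \mid \Gamma_1, \Gamma_2 \vdash B}\ (\mathit{cut})$$

Structural Rules:

$$\frac{G \mid \Gamma \vdash C}{G \mid \Gamma, A \vdash C}\ (W) \qquad \frac{G \mid \Gamma \vdash A}{G \mid \Gamma \vdash A \mid \Gamma' \vdash A'}\ (EW) \qquad \frac{G \mid \Gamma \vdash A \mid \Gamma \vdash A}{G \mid \Gamma \vdash A}\ (EC)$$

$$\frac{G \mid \Gamma_1, \Pi_1 \vdash A \qquad G' \mid \Gamma_2, \Pi_2 \vdash B}{G \mid G' \mid \Gamma_1, \Gamma_2 \vdash A \mid \Pi_1, \Pi_2 \vdash B}\ (\mathit{Comm})$$

Logical Rules:

$$\frac{G \mid \Gamma, A \vdash B}{G \mid \Gamma \vdash A \supset B}\ (\supset\text{-right}) \qquad \frac{G \mid \Gamma_1 \vdash A \qquad G' \mid \Gamma_2, B \vdash C}{G \mid G' \mid \Gamma_1, \Gamma_2, A \supset B \vdash C}\ (\supset\text{-left})$$

$$\frac{G \mid \Gamma_1 \vdash A \qquad G' \mid \Gamma_2 \vdash B}{G \mid G' \mid \Gamma_1, \Gamma_2 \vdash A \land B}\ (\land\text{-right}) \qquad \frac{G \mid \Gamma, A_i \vdash C}{G \mid \Gamma, A_1 \land A_2 \vdash C}\ (\land_i\text{-left})$$

$$\frac{G \mid \Gamma \vdash A_i}{G \mid \Gamma \vdash A_1 \lor A_2}\ (\lor_i\text{-right}) \qquad \frac{G \mid \Gamma, A \vdash C \qquad G' \mid \Gamma, B \vdash C}{G \mid G' \mid \Gamma, A \lor B \vdash C}\ (\lor\text{-left})$$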
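To make the two external structural rules concrete, the following Python sketch treats a hypersequent as a multiset of sequents and implements only (EW) and (EC). The encoding (tuples and `Counter`) and the function names are assumptions chosen for illustration; they are not taken from [1] or [3].

```python
from collections import Counter

def sequent(antecedent, succedent):
    """Encode a sequent Gamma |- A as a hashable pair; Gamma is a multiset,
    so the order of the antecedent formulas is normalized by sorting."""
    return (tuple(sorted(antecedent)), succedent)

def external_weakening(hypersequent, component):
    """(EW): add one more component to the hypersequent."""
    result = Counter(hypersequent)
    result[component] += 1
    return result

def external_contraction(hypersequent, component):
    """(EC): merge two identical components into one."""
    if hypersequent[component] < 2:
        raise ValueError("(EC) needs two copies of the component")
    result = Counter(hypersequent)
    result[component] -= 1
    return result

# A one-component hypersequent  A, B |- C
s = sequent(["B", "A"], "C")
h = Counter([s])
h_twice = external_weakening(h, s)           # A, B |- C | A, B |- C
h_back = external_contraction(h_twice, s)    # A, B |- C
```

Because components are counted rather than listed, the order of components in a hypersequent plays no role, which matches the multiset reading used above.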



The calculus HC* is obtained by replacing, in the above calculus, the
($\land_i$-left) rule with the following one:

$$\frac{G \mid \Gamma, A, B \vdash C}{G \mid \Gamma, A \land B \vdash C}\ (\land\text{-left})$$

The calculus GLC (see [1]) is obtained by simply adding the contraction rule
to HC or HC*. The first-order calculi HC, HC*, and GLC are obtained by
adding to the corresponding calculi the following rules:



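$$\frac{G \mid \Gamma \vdash F(a)}{G \mid \Gamma \vdash \forall x F(x)}\ (\forall\text{-right}) \qquad \frac{G \mid \Gamma, F(t) \vdash C}{G \mid \Gamma, \forall x F(x) \vdash C}\ (\forall\text{-left})$$

$$\frac{G \mid \Gamma \vdash F(t)}{G \mid \Gamma \vdash \exists x F(x)}\ (\exists\text{-right}) \qquad \frac{G \mid \Gamma, F(a) \vdash C}{G \mid \Gamma, \exists x F(x) \vdash C}\ (\exists\text{-left})$$

where, in the rules ($\forall$-right) and ($\exists$-left), $a$ does not occur in the lower
hypersequent.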

The correctness and completeness results for our hypersequent calculi (with 
respect to the Hilbert style systems) are easily extended from the propositional 
level (see [3]) to the first-order level. 



On the Undecidability of Some Sub-classical First-Order Logics 265 

Theorem 3. Provability of "$\vdash F$" in HC, HC*, and GLC coincides with
provability of "$F$" in $C$, $C^*$, and $G_\infty$, respectively.

Corollary 2. (1) The existential fragments of the logics $C$, $C^*$, and $G_\infty$
with function symbols are undecidable. (2) The prenex fragments of these logics
without function symbols are undecidable.

Proof. It is easy to check that the rules weak, assoc-l, assoc-r, comm, idem, distr,
$\exists$-in' are both admissible and V-normal in the mentioned logics.

The schema $(p \supset \neg p) \lor (\neg p \supset p)$, which is an instance of the linearity axiom, is
a duality principle for $C$, $C^*$, and $G_\infty$. Indeed, by using Avron’s Communication
rule (Comm) it can be derived in the corresponding calculi (see [1,3]), but neither
$p \supset \neg p$ nor $\neg p \supset p$ is provable in classical logic. Therefore the statements (1)
and (2) follow from Theorem 1 and Theorem 2, respectively.
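The classical half of this argument can be checked mechanically. The short Python sketch below (the helper names are ours, introduced only for illustration) enumerates truth values and confirms that the disjunction is a classical tautology while neither disjunct is.

```python
from itertools import product

def implies(a, b):
    """Classical material implication."""
    return (not a) or b

def is_tautology(formula, arity):
    """True iff the formula holds under every classical truth assignment."""
    return all(formula(*values) for values in product([False, True], repeat=arity))

def left(p):      # p implies not-p
    return implies(p, not p)

def right(p):     # not-p implies p
    return implies(not p, p)

def schema(p):    # (p implies not-p) or (not-p implies p)
    return left(p) or right(p)

print(is_tautology(schema, 1), is_tautology(left, 1), is_tautology(right, 1))
```

This only covers classical two-valued semantics; derivability of the schema in $C$, $C^*$, and $G_\infty$ rests on the (Comm) rule as stated above.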

This should be contrasted with the fact that decision procedures for $I^-$
and $I^{-*}$ can directly be extracted from corresponding cut-free calculi as proved
in [14].

Finally, we remark that our general undecidability results also apply to
(other) important fuzzy logics:

Corollary 3. Also for BL, Łukasiewicz and Product logic the existential fragment
with function symbols and the prenex fragments without function symbols
are undecidable.



5 Analyticity and a Decidable Subclass 

Proposition 1. The first-order calculi $LJ^-$, $LJ^{-*}$ and LJ admit cut-elimination.

Proof. (Sketch) For the sequent calculus LJ for intuitionistic logic, this is well
known. An inspection of the classical proof shows that the absence of contraction
does not affect cut-elimination.

In [3] ([1]) it was proved that the hypersequent calculi HC and HC* (respectively,
GLC) admit cut-elimination. In order to extend these results to the first-order
level we need the following “eigenvariable” lemma. (The straightforward
proof is omitted for space reasons.)

Lemma 1. Let $\mathcal{P}(a)$ be a proof in the first-order calculus HC (HC*, GLC)
of the hypersequent $S$ that contains the variable $a$. If, throughout the proof, we
replace $a$ by a term $t$ containing only variables that do not occur in $\mathcal{P}(a)$, we
obtain a proof $\mathcal{P}(t)$ ending with $S[t/a]$.



Theorem 4. First-order HC, HC* and GLC admit cut-elimination.






Matthias Baaz et al. 



Proof. It is enough to show that if $\mathcal{P}$ is a proof of a hypersequent $H$ containing
only one cut rule, which occurs as the last inference of $\mathcal{P}$, then $H$ is derivable
without the cut rule.

Cut-elimination for hypersequent calculi works essentially in the same way
as for the corresponding sequent calculi. As was noticed in [4], a simple way to
make the inductive argument work in the presence of the (EC) rule is to consider
the number of applications of this rule in a given derivation as an independent
parameter. The proof will proceed by induction on the lexicographically ordered
triple $(r, c, h)$, where $r$ is the number of applications of the (EC) rule in the
proofs of the premises of the cut rule, $c$ is the complexity of the cut formula,
and $h$ is the sum of the lengths of the proofs of the premises of the cut rule.

It suffices to consider the following cases according to which inference rule is
being applied just before the application of the cut rule:

1. either $G \mid \Gamma \vdash A$ or $G' \mid \Gamma', A \vdash B$ is an initial hypersequent;

2. either $G \mid \Gamma \vdash A$ or $G' \mid \Gamma', A \vdash B$ is obtained by a structural rule;

3. both $G \mid \Gamma \vdash A$ and $G' \mid \Gamma', A \vdash B$ are lower sequents of some logical rules
such that the principal formulas of both rules are just the cut formulas;

4. either $G \mid \Gamma \vdash A$ or $G' \mid \Gamma', A \vdash B$ is a lower sequent of a logical rule whose
principal formula is not the cut formula.

We will give here a proof for some relevant cases. Suppose that the last
inference in the proof of one premise of the cut is the (EC) rule, and the proof
ends as follows:

$$\dfrac{\dfrac{G \mid \Gamma \vdash A \mid \Gamma \vdash A}{G \mid \Gamma \vdash A}\ (EC) \qquad G' \mid A, \Gamma' \vdash B}{G \mid G' \mid \Gamma, \Gamma' \vdash B}\ (\mathit{cut})$$

Let $r'$ be the number of applications of the (EC) rule in the above proof.
This proof can be replaced by

$$\dfrac{\dfrac{\dfrac{G \mid \Gamma \vdash A \mid \Gamma \vdash A \qquad G' \mid A, \Gamma' \vdash B}{G \mid G' \mid \Gamma, \Gamma' \vdash B \mid \Gamma \vdash A}\ (\mathit{cut}) \qquad G' \mid A, \Gamma' \vdash B}{G \mid G' \mid G' \mid \Gamma, \Gamma' \vdash B \mid \Gamma, \Gamma' \vdash B}\ (\mathit{cut})}{G \mid G' \mid \Gamma, \Gamma' \vdash B}\ (EC)$$

which contains two cuts with $r' - 1$ applications of the (EC) rule. Then they
can be eliminated by the induction hypothesis.

Cases concerning the remaining structural rules and the logical ones are
treated as in the corresponding proofs of the cut-elimination theorem for the
hypersequent calculi HC, HC* and GLC (see [3]). In the following we show
how to eliminate a cut involving the rules for quantifiers.






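Suppose that the last inference in the proof of one premise of the cut is the
($\exists$-left) rule, and the proof, say $\mathcal{P}$, ends as follows:

$$\dfrac{\dfrac{G \mid B(a), \Gamma \vdash A}{G \mid \exists x B(x), \Gamma \vdash A}\ (\exists\text{-left}) \qquad G' \mid \Sigma, D(a), A \vdash C}{G \mid G' \mid \exists x B(x), \Gamma, \Sigma, D(a) \vdash C}\ (\mathit{cut})$$

Let $\mathcal{P}(a)$ be the proof ending with $G' \mid \Sigma, D(a), A \vdash C$. By Lemma 1,
replacing $a$ by a fresh variable $b$ throughout $\mathcal{P}(a)$ gives a proof ending with
$G'[b/a] \mid \Sigma[b/a], D(b), A \vdash C$. Thus the proof $\mathcal{P}$ can be replaced by

$$\dfrac{\dfrac{G \mid B(a), \Gamma \vdash A \qquad G'[b/a] \mid \Sigma[b/a], D(b), A \vdash C}{G \mid G'[b/a] \mid B(a), \Gamma, \Sigma[b/a], D(b) \vdash C}\ (\mathit{cut})}{G \mid G'[b/a] \mid \exists x B(x), \Gamma, \Sigma[b/a], D(b) \vdash C}\ (\exists\text{-left})$$

in which the cut is shifted upward, so it can be eliminated by the induction
hypothesis.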

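Suppose that both premises of the cut rule are lower sequents of the rules
for the quantifier $\exists$ and that the cut formula is the principal formula of both
rules, that is

$$\dfrac{\dfrac{G \mid \Gamma \vdash B(t)}{G \mid \Gamma \vdash \exists x B(x)}\ (\exists\text{-right}) \qquad \dfrac{G' \mid \Sigma, B(a) \vdash C}{G' \mid \Sigma, \exists x B(x) \vdash C}\ (\exists\text{-left})}{G \mid G' \mid \Gamma, \Sigma \vdash C}\ (\mathit{cut})$$

Let $\mathcal{P}$ and $\mathcal{Q}(a)$ be the proofs ending with $G \mid \Gamma \vdash B(t)$ and
$G' \mid \Sigma, B(a) \vdash C$, respectively. By repeatedly applying Lemma 1 to $\mathcal{Q}(a)$, so that $a$ is
replaced by $t$, one can replace the previous proof by the following one:

$$\dfrac{G \mid \Gamma \vdash B(t) \qquad G' \mid \Sigma, B(t) \vdash C}{G \mid G' \mid \Gamma, \Sigma \vdash C}\ (\mathit{cut})$$

This proof contains a cut whose cut formula has smaller complexity than the
cut formula in the previous proof. Then it can be eliminated by the induction
hypothesis.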

Cases involving the quantifier V can be treated similarly. 

As an easy consequence of cut-elimination we obtain the decidability of the
fragments of $C$, $C^*$, and $G_\infty$, respectively, where quantifiers occur only strongly:

Corollary 4. Let $H$ consist of sequents $\Gamma_i \vdash C_i$, where the formulas of $\Gamma_i$ are
purely existential and $C_i$ is purely universal. The derivability of $H$ in HC, HC*,
as well as GLC is decidable.






References 

1. A. Avron. Hypersequents, logical consequence and intermediate logics for concurrency.
Annals of Mathematics and Artificial Intelligence, 4:225-248, 1991.

2. A. Avron. The method of hypersequents in the proof theory of propositional
nonclassical logics. In Logic: from Foundations to Applications, European Logic
Colloquium, pages 1-32. Oxford Science Publications. Clarendon Press, Oxford,
1996.

3. M. Baaz, A. Ciabattoni, C. Fermüller, and H. Veith. Proof theory of fuzzy logics:
Urquhart’s C and related logics. In Mathematical Foundations of Computer Science
1998, 23rd International Symposium, MFCS’98, Brno, Czech Republic, August 24-28,
1998, Proceedings, volume 1450 of LNCS, pages 203-212. Springer-Verlag, 1998.

4. A. Ciabattoni. Bounded contraction in systems with linearity. In N. Murray, editor,
Automated Reasoning with Analytic Tableaux and Related Methods, International
Conference, TABLEAUX’99, Saratoga Springs, volume 1617 of LNAI, pages 113-128.
Springer-Verlag, 1999. To appear.

5. M. Dummett. A propositional calculus with a denumerable matrix. Journal of
Symbolic Logic, 24:96-107, 1959.

6. J.Y. Girard. Linear logic. Theoretical Computer Science, 50:1-101, 1987.

7. K. Gödel. Zum intuitionistischen Aussagenkalkül. Anzeiger der Akademie der
Wissenschaften in Wien, 69:65-66, 1932.

8. P. Hájek. Metamathematics of Fuzzy Logic. Kluwer, 1998.

9. P. Hájek, L. Godo, and F. Esteva. A complete many-valued logic with product-conjunction.
Archive for Math. Logic, 35:191-208, 1996.

10. E. Kiriyama and H. Ono. The contraction rule in decision problems for logics
without structural rules. Studia Logica, 50/2:299-319, 1991.

11. Y. Komori. Predicate logics without structural rules. Studia Logica, 45:393-404,
1985.

12. J. Łukasiewicz. Philosophische Bemerkungen zu mehrwertigen Systemen des
Aussagenkalküls. Comptes rendus des séances de la Société des Sciences et des Lettres
de Varsovie Cl. III, 23:51-77, 1930.

13. J.M. Méndez and F. Salto. Urquhart’s C with intuitionistic negation: Dummett’s
LC without the contraction axiom. Notre Dame Journal of Formal Logic, 36/3:407-413,
1995.

14. H. Ono and Y. Komori. Logics without the contraction rule. Journal of Symbolic
Logic, 50/1:169-201, 1985.

15. G. Takeuti. Proof Theory. North-Holland, Amsterdam, 2nd edition, 1987.

16. A. Urquhart. Many-valued logic. In D. Gabbay and F. Guenthner, editors, Handbook
of Philosophical Logic, volume III. Reidel, Dordrecht, 1984.

17. A. Urquhart. Basic Many-valued logic. Manuscript, private communication, 1999.



How to Compute with DNA* 



Lila Kari^1, Mark Daley^1, Greg Gloor^2, Rani Siromoney^3, and
Laura F. Landweber^4

^1 Dept. of Computer Sci., Univ. of Western Ontario, London, ON N6A 5B7 Canada
lila@csd.uwo.ca, www.csd.uwo.ca/~lila
daley@csd.uwo.ca, www.csd.uwo.ca/~daley
^2 Dept. of Biochemistry, Univ. of Western Ontario, London, ON N6A 5C1 Canada
ggloor@julian.uwo.ca, www.biochem.uwo.ca/fac/~gloor
^3 Dept. of Computer Sci., Madras Christian College, Madras 600 059 India
ranisiro@satyam.net.in
^4 Dept. of Ecology & Evolutionary Biology, Princeton Univ., NJ 08544-1003 USA
lfl@princeton.edu, www.princeton.edu/~lfl



Abstract. This paper addresses two main aspects of DNA computing 
research: DNA computing in vitro and in vivo. We first present a model 
of DNA computation developed in [5]: the circular insertion/deletion sys- 
tem. We review the result obtained in [5] stating that this system has 
the computational power of a Turing machine, and present the outcome 
of a molecular biology laboratory experiment from [5] that implements 
a small instance of such a system. This shows that rewriting systems 
of the circular insertion/deletion type are viable alternatives in DNA 
computation in vitro. In the second half of the paper we address DNA 
computing in vivo by presenting a model proposed in [17] and developed 
in [18] for the homologous recombinations that take place during gene re- 
arrangement in ciliates. Such a model has universal computational power 
which indicates that, in principle, some unicellular organisms may have 
the capacity to perform any computation carried out by an electronic 
computer. 



1 Introduction 

Electronic computers are only the latest in a long chain of man’s attempts to use 
the best technology available for doing computations. While it is true that their 
appearance, some 50 years ago, has revolutionized computing, computing does 
not start with electronic computers, and there is no reason why it should end 
with them. Indeed, even electronic computers have their limitations: there is only 
so much data they can store and their speed thresholds determined by physical 
laws will soon be reached. The latest attempt to break down these barriers is 
to replace, once more, the tools for doing computations: instead of electrical
tools, use biological ones.

Research in this area was started by Leonard Adleman in 1994, [1], when
he surprised the scientific community by using the tools of molecular biology to
solve a hard computational problem. Adleman’s experiment, solving an instance
of the Directed Hamiltonian Path Problem solely by manipulating DNA strands,
marked the first instance of a mathematical problem being solved by biological
means. The experiment provoked an avalanche of computer science/molecular
biology/biochemistry/physics research, while generating at the same time a
multitude of open problems.

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 269-282, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999

The excitement DNA computing incited was mainly caused by its capability
of massively parallel searches. This, in turn, showed its potential to yield
tremendous advantages from the point of view of speed, energy consumption and
density of stored information. For example, in Adleman’s model, [2], the number
of operations per second was up to $1.2 \times 10^{18}$. This is approximately 1,200,000
times faster than the fastest supercomputer. While existing supercomputers
execute $10^{9}$ operations per Joule, the energy efficiency of a DNA computer could
be $2 \times 10^{19}$ operations per Joule, that is, a DNA computer could be about $10^{10}$
times more energy efficient (see [1]). Finally, according to [1], storing information
in molecules of DNA could allow for an information density of approximately
1 bit per cubic nanometer, while existing storage media store information at a
density of approximately 1 bit per $10^{12}$ nm$^3$. As estimated in [3], a single DNA
memory could hold more words than all the computer memories ever made.
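The ratios behind these claims can be checked with integer arithmetic. The sketch below takes the quoted figures to be $1.2 \times 10^{18}$ operations per second, $10^{9}$ and $2 \times 10^{19}$ operations per Joule, and 1 bit per $10^{12}$ nm$^3$; the variable names are ours.

```python
# Speed: DNA operations per second and the quoted speed-up factor.
dna_ops_per_second = 12 * 10**17          # 1.2 x 10^18
speedup = 1_200_000
implied_supercomputer_ops = dna_ops_per_second // speedup

# Energy: operations per Joule for each technology.
dna_ops_per_joule = 2 * 10**19
supercomputer_ops_per_joule = 10**9
energy_ratio = dna_ops_per_joule // supercomputer_ops_per_joule

# Density: cubic nanometers needed per stored bit.
dna_nm3_per_bit = 1
media_nm3_per_bit = 10**12
density_ratio = media_nm3_per_bit // dna_nm3_per_bit

print(implied_supercomputer_ops, energy_ratio, density_ratio)
```

Under these readings the comparison supercomputer performs about $10^{12}$ operations per second, and the energy ratio is $2 \times 10^{10}$, which the text rounds to "about $10^{10}$".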

A few more words, as to why we should prefer biomolecules to electricity for 
doing computation: the short answer is that it seems more natural to do so. We 
could look at the electronic technology as just a technology that was in the right 
place at the right time; indeed, electricity hardly seems a suitable and intuitive 
means for storing information and for computations. For these purposes, nature 
prefers instead a medium consisting of biomolecules: DNA has been used for 
millions of years as storage for genetic information, while the functioning of 
living organisms requires computations. Such considerations seem to indicate 
that using biomolecules for computations is a promising new avenue, and that 
DNA computers might soon coexist with electronic computers. 

The research in the field has, from the beginning, had both experimental 
and theoretical aspects; for an overview of the research on DNA computing 
see [12]. This paper addresses both aspects. After introducing the basic notions 
about DNA in Section 2, in Section 3 we present a model of DNA computa- 
tion developed in [5]: the circular insertion/deletion system. We show that this 
system has the computational power of a Turing machine and also present the 
results of a molecular biology laboratory experiment that implements a small 
instance of such a system. This shows that rewriting systems of the circular 
insertion/deletion type are viable alternatives in DNA computation in vitro. 
Section 4 introduces DNA computing in vivo by presenting a model proposed 
in [17], [18] for the homologous recombinations that take place during gene rear- 
rangement in ciliates. We prove that such a model has universal computational 
power which indicates that, in principle, some unicellular organisms may have 
the capacity to perform any computation carried out by an electronic computer. 






2 What is DNA? 

DNA (deoxyribonucleic acid) is found in every cellular organism as the storage 
medium for genetic information. It is composed of units called nucleotides, dis- 
tinguished by the chemical group, or base, attached to them. The four bases 
are adenine, guanine, cytosine and thymine, abbreviated as A, G, C, and T. 
(The names of the bases are also commonly used to refer to the nucleotides that 
contain them.) Single nucleotides are linked together end-to-end to form DNA 
strands. A short single-stranded polynucleotide chain, usually less than 30 nu- 
cleotides long, is called an oligonucleotide (or, shortly, oligo). The DNA sequence 
has a polarity: a sequence of DNA is distinct from its reverse. The two distinct 
ends of a DNA sequence are known under the name of the 5' end and the 3' 
end, respectively. Taken as pairs, the nucleotides A and T and the nucleotides C 
and G are said to be complementary. Two complementary single-stranded DNA 
sequences with opposite polarity will join together to form a double helix in a 
process called base-pairing or annealing. The reverse process, a double helix
coming apart to yield its two constituent single strands, is called melting.
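Complementarity and opposite polarity can be expressed directly on strings. In the following Python sketch both strands are written in the 5' to 3' direction, so two strands can anneal into a perfect double helix exactly when one is the reverse complement of the other. The function names are illustrative assumptions.

```python
# Watson-Crick complementary pairs: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the strand that pairs with `strand`, also read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def can_anneal(strand1, strand2):
    """True iff the two 5'-to-3' strands form a perfectly paired duplex."""
    return strand2 == reverse_complement(strand1)

print(reverse_complement("AAGC"))
```

Real annealing also tolerates partial matches and depends on temperature and sequence length; this sketch models only exact, full-length pairing.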

A single strand of DNA can be likened to a string consisting of a combination
of four different symbols: A, G, C, T. Mathematically, this means we have at
our disposal a 4-letter alphabet X = {A, G, C, T} to encode information. As
for the operations that can be performed on DNA strands, the existing
models of DNA computation are based on various combinations of the following
primitive bio-operations, [12]:

- Synthesizing a desired polynomial-length strand. 

- Mixing: pour the contents of two test-tubes into a third. 

- Annealing (hybridization): bond together two single-stranded complementary 
DNA sequences by cooling the solution. 

- Melting (denaturation): break apart a double-stranded DNA into its single- 
stranded components by heating the solution. 

- Amplifying (copying): make copies of DNA strands by using the Polymerase 
Chain Reaction (PCR). 

- Separating the strands by size using a technique called gel electrophoresis. 

- Extracting those strands that contain a given pattern as a substring by using 
affinity purification. 

- Cutting DNA double-strands at specific sites by using commercially available 
restriction enzymes. 

- Ligating: paste DNA strands with compatible sticky ends by using DNA ligases. 

- Substituting: substitute, insert or delete DNA sequences by using PCR site- 
specific oligonucleotide mutagenesis. 

- Detecting and Reading a DNA sequence from a solution. 

The bio-operations listed above and possibly others, will then be used to 
write “programs” which receive a tube containing DNA strands as input and 
return as output a set of tubes. A computation consists of a sequence of tubes 
containing DNA strands. 
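As a rough illustration of this "program over tubes" view, the Python sketch below models a tube as a multiset of strings and implements four of the listed operations. It is a deliberately idealized abstraction under our own assumptions (error-free operations, hypothetical function names), not a description of any laboratory protocol.

```python
from collections import Counter

def mix(tube1, tube2):
    """Mixing: pour the contents of two tubes into a third."""
    return tube1 + tube2

def amplify(tube, copies=2):
    """Amplifying: multiply the number of copies of every strand."""
    return Counter({strand: n * copies for strand, n in tube.items()})

def extract(tube, pattern):
    """Extracting: split a tube into strands containing `pattern` and the rest."""
    hits = Counter({s: n for s, n in tube.items() if pattern in s})
    return hits, tube - hits

def separate_by_length(tube, length):
    """Separating: keep only the strands of the given length."""
    return Counter({s: n for s, n in tube.items() if len(s) == length})

# A tiny "program": mix two tubes, pull out strands containing CG, amplify them.
tube = mix(Counter(["ACGT", "GGCC"]), Counter(["ACGT", "TT"]))
with_cg, rest = extract(tube, "CG")
amplified = amplify(with_cg, copies=3)
short = separate_by_length(rest, 2)
```

Each intermediate value is itself a tube, so a computation in this sketch is literally a sequence of tubes, as described above.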

For further details of molecular biology terminology, the reader is referred 
to [12], [16]. 






3 How to Compute with DNA: Circular Insertions and 
Deletions 

One aspect of the theoretical side of DNA computing research
comprises attempts to find a suitable model and to give it a mathematical foundation.
This aspect is exemplified below by the circular contextual insertion/deletion
system, [5], a formal language model of DNA computing. We mention the
result obtained in [5] that circular insertion/deletion systems are capable of
universal computations. We also give the results of an experimental laboratory
implementation of our model. This shows that rewriting systems of the circular
insertion/deletion type are viable alternatives in DNA computation.

Insertions and deletions of small circular strands of DNA into/from long
linear strands happen frequently in all types of cells and also constitute one of
the methods used by some viruses to infect a host. We describe here a generalization
of insertions and deletions of words, [11], that aims to model these processes.
(Note that circular DNA strings have been studied in the literature in the context
of the splicing system model in [8], [9], [21], [24], [25].)

In order to introduce our model, we first need some formal language definitions
and notations. Throughout this paper, $X$ represents an alphabet (a finite
nonempty set), $\lambda$ represents the empty word (the word containing 0 letters), and
$\bullet v$ represents a circular string $v$ (a set containing every circular permutation of the
linear string $v$). The length of a word $v$, denoted by $|v|$, is the number of
occurrences of letters in $v$, counting repetitions. For a language $L$, by $uL$ we denote
the set of all words $uv$ where $v \in L$. For further formal language definitions and
notations the reader is referred to [22], [23].
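The set-of-rotations reading of a circular string is easy to compute. A minimal Python sketch, with function names chosen here for illustration:

```python
def circular_string(v):
    """Return the circular string of v: the set of all circular permutations of v."""
    if not v:
        return {""}
    return {v[i:] + v[:i] for i in range(len(v))}

def same_circular_string(u, v):
    """True iff the linear words u and v represent the same circular string."""
    return len(u) == len(v) and u in circular_string(v)

print(sorted(circular_string("abc")))
```

Note that a word with internal repetition has fewer distinct rotations than letters: the circular string of "aa" contains a single element.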

In the style of [15], we define a circular insertion/deletion system, [5], as a
tuple

$$ID^\bullet = (X, T, I^\bullet, D, A)$$

where $X$ is an alphabet, $\mathrm{card}(X) \geq 2$, $T \subseteq X$ is the terminal alphabet, $I^\bullet \subseteq (X^*)^5$
is the finite set of circular insertion rules, $D \subseteq (X^*)^3$ is the finite set of
deletion rules, and $A \in X^+$ is a linear strand called the axiom.

A circular insertion rule in $I^\bullet$ is written as $(c_1, g_1, \bullet x, g_2, c_2)_I$, where $(c_1, c_2)$
represents the context of the insertion, $\bullet x$ is the string to be inserted and $(g_1, g_2)$
are the guides, i.e. the location where $\bullet x$ is cut.

Given the rule above, the guided contextual circular insertion of the circular
string $\bullet x$ into a linear string $u$ is performed as follows. The circular word $\bullet x$ is
linearized by cutting it between $g_1$ and $g_2$ (provided $g_1 g_2$ occurs as a subword
in $\bullet x$) and reading it clockwise starting from $g_1$ and ending at $g_2$. The resulting
linear strand is then inserted into the linear word $u$, between $c_1$ and $c_2$. If $c_1 c_2$
does not occur as a subword in $u$, no insertion can take place. An example of
circular insertion is illustrated in Figure 1.

A deletion rule in $D$ is written as $(c_1, x, c_2)_D$, where $(c_1, c_2)$ represents the
context of deletion and $x$ is the string to be deleted.







Fig. 1. Graphical representation of a circular insertion in the context (x,y), 
where the circular string is cut at the site (A,B). (From [5].) 



Given the rule above, the linear contextual deletion of $x$ from a linear word $u$
accomplishes the excision of the linear strand $x$ from $u$, provided $x$ occurs in $u$
flanked by $c_1$ on its left side and by $c_2$ on its right side.

If $u, v \in X^*$, we say that $u$ derives $v$ according to $ID^\bullet$, and we write $u \Rightarrow v$, [5],
if $v$ is obtained from $u$ by either a guided contextual circular insertion or by a
linear contextual deletion, i.e.,

- either $u = \alpha c_1 c_2 \beta$, $v = \alpha c_1 g_1 x' g_2 c_2 \beta$ and $I^\bullet$ contains the circular insertion
rule $(c_1, g_1, \bullet x, g_2, c_2)_I$ where $g_1 x' g_2 \in \bullet x$, or

- $u = \alpha c_1 x c_2 \beta$, $v = \alpha c_1 c_2 \beta$ and $D$ contains the linear deletion rule $(c_1, x, c_2)_D$.

A sequence of direct derivations

$$u_1 \Rightarrow u_2 \Rightarrow \dots \Rightarrow u_k, \quad k \geq 1$$

is denoted by $u_1 \Rightarrow^* u_k$, and $u_k$ is said to be derived from $u_1$.

The language $L(ID^\bullet)$, [5], accepted by the circular insertion/deletion system
$ID^\bullet$ is defined as

$$L(ID^\bullet) = \{ v \in T^* \mid v \Rightarrow^* A,\ A \text{ is the axiom} \}$$
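The one-step derivation relation can be sketched in Python as follows. The code follows the two conditions as stated above: an insertion places a linearization of the circular string that starts with the first guide and ends with the second guide between the two context words, and a deletion excises $x$ between its contexts. Rule tuples, function names and the toy rules are assumptions made for this illustration.

```python
def rotations(x):
    """All circular permutations of the linear word x."""
    return {x[i:] + x[:i] for i in range(len(x))} if x else {""}

def insertions(u, rule):
    """All words obtained from u by one application of an insertion rule
    (c1, g1, x, g2, c2), where x is any linear representative of the circular string."""
    c1, g1, x, g2, c2 = rule
    # Linearizations of the circular string of the form g1 x' g2 (guides not overlapping).
    linear = {w for w in rotations(x)
              if len(w) >= len(g1) + len(g2) and w.startswith(g1) and w.endswith(g2)}
    results = set()
    start = u.find(c1 + c2)
    while start != -1:
        cut = start + len(c1)                 # position between c1 and c2
        results.update(u[:cut] + w + u[cut:] for w in linear)
        start = u.find(c1 + c2, start + 1)
    return results

def deletions(u, rule):
    """All words obtained from u by one application of a deletion rule (c1, x, c2)."""
    c1, x, c2 = rule
    pattern = c1 + x + c2
    results = set()
    start = u.find(pattern)
    while start != -1:
        results.add(u[:start] + c1 + c2 + u[start + len(pattern):])
        start = u.find(pattern, start + 1)
    return results

def derive_step(u, insertion_rules, deletion_rules):
    """The set of all v with u => v."""
    results = set()
    for rule in insertion_rules:
        results |= insertions(u, rule)
    for rule in deletion_rules:
        results |= deletions(u, rule)
    return results

# Toy rules: insert the circular string of "bcd", cut so that it reads c...b,
# between the contexts "a" and "e"; the deletion rule undoes that insertion.
ins_rule = ("a", "c", "bcd", "b", "e")
del_rule = ("a", "cdb", "e")
print(derive_step("xaey", [ins_rule], [del_rule]))
```

Deciding membership in the accepted language would require iterating `derive_step` until the axiom is reached; since insertions can lengthen words without bound, any such search needs an explicit bound, which is omitted here.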

Recall that, [23], a rewriting system $(S, X \cup \{\#\}, F)$ is called a Turing
machine iff the following conditions are satisfied.

(i) $S$ and $X \cup \{\#\}$ (with $\# \notin X$ and $X \neq \emptyset$) are two disjoint alphabets
referred to as the state and tape alphabet.

(ii) Elements $s_0 \in S$, $\flat \in X$, and a subset $S_f \subseteq S$ are specified, namely,
the initial state, the blank symbol, and the final state set. A subset $V_f \subseteq X$ is
specified as the final alphabet.

(iii) The productions in $F$ are of the forms

(1) $s_i a \to s_j b$ (overprint)

(2) $s_i a c \to a s_j c$ (move right)

(3) $s_i a \# \to a s_j \flat \#$ (move right and extend workspace)

(4) $c s_i a \to s_j c a$ (move left)

(5) $\# s_i a \to \# s_j \flat a$ (move left and extend workspace)

where $s_i, s_j \in S$ and $a, b, c \in X$. Furthermore, for each $s_i, s_j \in S$ and $a \in X$, $F$
either contains no productions (2) and (3) (resp. (4) and (5)) or else contains
both (2) and (3) (respectively (4), (5)) for every $c \in X$. For no $s_i \in S$ and $a \in X$,
the word $s_i a$ is a subword of the left side in two productions of the forms (1),
(3) and (5).

We say that a word $sw$, where $s \in S$ and $w \in (X \cup \{\#\})^*$, is final iff $w$
does not begin with a letter $a$ such that $sa$ is a subword of the left side of some
production in $F$. The language accepted by a Turing machine TM is defined by

$$L(TM) = \{ w \in V_f^* \mid \# s_0 w \# \Rightarrow^* \# w_1 s_f w_2 \# \text{ for some } s_f \in S_f,\ w_1, w_2 \in X^* \text{ such that } s_f w_2 \# \text{ is final} \}$$

where $\Rightarrow$ denotes derivation according to the rewriting rules (1) - (5) of the
Turing machine. A language $L$ is acceptable by a Turing machine iff $L = L(TM)$
for some TM. It is to be noted that TM is deterministic: at each step of the
rewriting process, at most one production is applicable.
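To see the rewriting view of a Turing machine in action, here is a small Python sketch of a toy machine over the tape alphabet {a, b, blank} that scans right over a's and enters the final state F when it reaches the blank. The machine, the single-character encoding of states and symbols (`S`, `F`, `_` for the blank) and the function names are all assumptions made for illustration; only productions of forms (1), (2) and (3) are used.

```python
BLANK = "_"               # stands in for the blank symbol (assumed encoding)
TAPE = "ab" + BLANK       # tape alphabet X of the toy machine

# Productions as (left side, right side) pairs.
PRODUCTIONS = (
    [("S" + BLANK, "F" + BLANK)]               # form (1): overprint, switch S to F
    + [("Sa" + c, "aS" + c) for c in TAPE]     # form (2): move right over an a
    + [("Sa#", "aS" + BLANK + "#")]            # form (3): move right, extend workspace
)

def step(config, productions):
    """Apply the applicable production, or return None if the machine halts.
    The configuration contains exactly one state symbol, so each left side
    occurs at most once."""
    for lhs, rhs in productions:
        if lhs in config:
            return config.replace(lhs, rhs, 1)
    return None

def run(word, productions, start="S", finals=("F",), max_steps=1000):
    """Rewrite from #s0 w# until no production applies; report acceptance."""
    config = "#" + start + word + "#"
    for _ in range(max_steps):
        nxt = step(config, productions)
        if nxt is None:
            accepted = any(state in config for state in finals)
            return accepted, config
        config = nxt
    raise RuntimeError("step bound exceeded")

print(run("aa", PRODUCTIONS))
```

For this toy machine a halting configuration in state F is final in the sense defined above, so `run` accepts exactly the nonempty words consisting only of a's; words containing b halt in the non-final state S.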

The following result proved in [5] shows that the circular insertion/deletion 
systems defined above have the computational power of a Turing machine. 

Theorem 1. If a language is acceptable by a Turing machine TM, then there 
exists a circular insertion/ deletion system ID* accepting the same language. 

To test the empirical validity of our theoretical model, we implemented, [5], 
a small circular insertion/deletion system in the laboratory. The purpose of this 
implementation was to show that in vitro circular insertion is possible and not
overwhelmingly difficult.

The following circular insertion/deletion system was chosen: 

$$ID^\bullet = (X, T, I^\bullet, D, u)$$

where the alphabets are $T = X = \{A, C, G, T\}$, there are no deletion rules, i.e.
$D = \emptyset$, the axiom $u$ is a small DNA segment from the Drosophila melanogaster
genome and $I^\bullet = \{(G, G, \bullet v, TCGAC, TCGAC)\}$, where $\bullet v$ is a commercially
available plasmid (circular strand). Note that A, C, G, T correspond to the four bases
that occur in natural DNA, and that the sequence G|TCGAC is the restriction
(cut) site for the Sal I enzyme.

To begin the experiment, we synthesized the linear axiom u in which we 
would then insert. This was accomplished by taking DNA from Drosophila (fruit 
fly) and performing PCR with the primers BG~^ and cd~ . The result was the 
amplification of a particular 682bp (basepair) linear sequence of DNA which 
became the axiom u of the circular insertion/deletion system. The 682bp linear 
strand was chosen to contain exactly one restriction site for the enzyme Sal I,
corresponding to the context of insertion (G, TCGAC). For the circular string
$\bullet v$ to be inserted we chose pK18h, a commercially available plasmid having
one restriction site for Sal I, corresponding to the guides (G, TCGAC) in the
insertion rule.






After verifying that the PCR had worked correctly and we had indeed obtained
the desired 682bp linear axiom $u$, we cut $u$ with Sal I, cleaving it into two
new linear strands denoted by $L$ and $R$, i.e. $u = L|R$. The product was checked
by gel electrophoresis to ensure the presence of bands corresponding to the sizes
of $L$ (188bp) and $R$ (493bp), as seen from the first band in the gel of Figure 2.
The plasmid $\bullet v$ was also cut and linearized in the same fashion, resulting in the
linear strand $v$.




Fig. 2. The first vertical lane of this gel consists of bands corresponding to the
unreacted linearized plasmid $v$, the linear strand $u$, and the two fragments of
the cut linear strand ($R$, respectively $L$). The second vertical lane shows a band
corresponding to the product obtained after reaction: the insertion of $v$ into $u$,
i.e., $u \leftarrow v$. The third lane contains a standard 1kb (kilobase) ladder used to
measure the others. (From [5].)



At this point the linear strands $L$ and $R$ were combined with the linearized
pK18h, i.e. $v$, and ligase was added to reconnect the strands of DNA. After
allowing time for ligation, a gel was run to determine the products. The second
band from the gel shown in Figure 2 indicates that in addition to the desired
L|plasmid|R, we also obtain R|R, L|R, plasmid|plasmid and even
plasmid|plasmid|plasmid.
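At the level of strings, the cut-and-ligate steps just described can be sketched in Python. The sequences below are short made-up examples, not the 682bp axiom or the real plasmid, and the sketch ignores double strands, sticky ends and the unwanted ligation products; the function names are hypothetical.

```python
SITE = "GTCGAC"   # Sal I recognition sequence
CUT = 1           # cut position inside the site: G|TCGAC

def cut_linear(u):
    """Cut a linear strand at its unique Sal I site, returning (L, R)."""
    i = u.find(SITE)
    if i == -1 or u.find(SITE, i + 1) != -1:
        raise ValueError("expected exactly one Sal I site")
    return u[:i + CUT], u[i + CUT:]

def linearize(plasmid):
    """Cut a circular strand (given as any of its rotations) at a Sal I site."""
    doubled = plasmid + plasmid[:len(SITE) - 1]   # lets the site wrap around
    i = doubled.find(SITE)
    if i == -1:
        raise ValueError("plasmid has no Sal I site")
    start = (i + CUT) % len(plasmid)
    return plasmid[start:] + plasmid[:start]

def insert_plasmid(u, plasmid):
    """The desired ligation product L|plasmid|R."""
    left, right = cut_linear(u)
    return left + linearize(plasmid) + right

toy_axiom = "AAGTCGACTT"       # hypothetical stand-in for u
toy_plasmid = "CCGTCGACGG"     # hypothetical stand-in for the plasmid
product = insert_plasmid(toy_axiom, toy_plasmid)
print(product, product.count(SITE))
```

In the toy product the Sal I site is recreated at both junctions, which is why the insert is flanked by two copies of the site.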







Note that the band corresponding to the approximate size of L|plasmid|R
can be seen as a smear. This could suggest the presence of R|plasmid|R or of
any other combination of two linear fragments and a plasmid which failed to
separate clearly from one another due to the large size. Thus further analysis
was required to ensure the presence of the desired product L|plasmid|R.

In order to amplify the amount of DNA available at this point, the DNA was
recircularized and introduced into E. coli bacteria. (The complex details of this
process are omitted here.)

Prior to sequencing, a restriction digest was performed on small amounts 
of product isolated from each of the several bacterial colonies. If the starting 
sample were a heterogeneous mixture of DNA molecules, each colony would 
yield a different product. Consequently, the restriction digest of DNA samples
(each isolated from a particular colony) with enzymes Sal I, Stu I and Xba I
resulted in bands indicating different size distributions. Of these, one sample
corresponded to the size of L|plasmid|R and the identity of the product was
confirmed by sequencing.

This experiment demonstrates that it is possible to insert a plasmid into a 
linear strand in vitro, implementing thus a circular insertion/deletion system. 
Future experimental work would ideally include a much larger system to test the
scalability of this approach.

4 How do Cells Compute? 

The previous section presented one of the existing models of DNA computa- 
tion and presented a toy experimental laboratory implementation. Despite the 
progress achieved in this direction of research, the main obstacles to creating an 
in vitro DNA computer still remain to be overcome. These obstacles are mainly 
practical, arising from difficulties in coping with the error rates of bio-operations, 
and with scaling up the existing systems. However, note that similar issues of ac- 
tively adjusting the concentrations of reactions and fault detection and tolerance 
are all addressed by biological systems in nature: cells. This leads to another di- 
rection of research, DNA computing in vivo, which addresses the computational 
capabilities of cellular organisms. 

Here we describe a model proposed in [17] and developed in [18] for the 
homologous recombinations that take place during gene rearrangement in cil- 
iates and prove that such a model has the computational power of a Turing 
machine. This indicates that, in principle, these unicellular organisms may have
the capacity to perform at least any computation carried out by an electronic
computer.

Ciliates are a diverse group of 8000 or more unicellular eukaryotes (nucleated 
cells) named for their wisp-like covering of cilia. They possess two types of nuclei: 
an active macronucleus (soma) and a functionally inert micronucleus (germline) 
which contributes only to sexual reproduction. The somatically active macronu- 
cleus forms from the germline micronucleus after sexual reproduction, during the 
course of development. The genomic copies of some protein-coding genes in the 








[Figure 3: diagram of a gene with segments numbered 1-7. Panel labels: "In the macronucleus, gene-sized chromosomes assemble from their scrambled building blocks; telomere repeats (boxes) mark and protect the surviving ends." and "In the micronucleus, coding regions of DNA are dispersed over the long chromosome."]

Fig. 3. Overview of gene unscrambling. Dispersed coding MDSs 1-7 reassemble 
during macronuclear development to form the functional gene copy 
(top), complete with telomere addition to mark and protect both ends of the 
gene. (From [17].) 

micronucleus of hypotrichous ciliates are obscured by the presence of intervening 
non-protein-coding DNA sequence elements (internally eliminated sequences, or 
IESs). These must be removed before the assembly of a functional copy of the 
gene in the somatic macronucleus. Furthermore, the protein-coding DNA segments 
(macronuclear destined sequences, or MDSs) in species of Oxytricha and 
Stylonychia are sometimes present in a permuted order relative to their final 
position in the macronuclear copy. For example, in O. nova, the micronuclear 
copy of three genes (Actin I, α-telomere binding protein, and DNA polymerase 
α) must be reordered and intervening DNA sequences removed in order to 
construct functional macronuclear genes. Most impressively, the gene encoding DNA 
polymerase α (DNA pol α) in O. trifallax is apparently scrambled in 50 or more 
pieces in its germline nucleus [10]. Destined to unscramble its micronuclear genes 
by putting the pieces together again, O. trifallax routinely solves a potentially 
complicated computational problem when rewriting its genomic sequences to 
form the macronuclear copies. 

This process of unscrambling bears a remarkable resemblance to the DNA al- 
gorithm Adleman [1] used to solve a seven-city instance of the Directed Hamilto- 
nian Path Problem. The developing ciliate macronuclear “computer” (Figure 3) 
apparently relies on the information contained in short direct repeat sequences 
to act as minimal guides in a series of homologous recombination events. These 
guide-sequences act as splints, and the process of recombination results in linking 
the protein-encoding segments (MDSs, or “cities”) that belong next to each other 
in the final protein coding sequence. As such, the unscrambling of sequences that 






encode DNA polymerase α accomplishes an astounding feat of cellular computation. 
Other structural components of the ciliate chromatin presumably play a 
significant role, but the exact details of the mechanism are still unknown. 

In this section we define the notion of a guided recombination system ([18]) 
that models the process taking place during gene rearrangement, and prove that 
such systems have the computational power of a Turing machine, the most widely 
used theoretical model of electronic computers. 

The following strand operations generalize the intra- and intermolecular 
recombinations defined in [17] and illustrated in Figure 4 by assuming that 
homologous recombination is influenced by the presence of certain contexts, i.e., 
either the presence of an IES or an MDS flanking a junction sequence. The 
observed dependence on the old macronuclear sequence for correct IES removal in 
Paramecium suggests that this is the case ([19]). This restriction captures the 
fact that the guide sequences do not contain all the information for accurate 
splicing during gene unscrambling. 





[Figure 4: diagram of a linear strand and a circular strand recombining at the repeat x.]

Fig. 4. Intra- and intermolecular recombinations using repeats x. During 
intramolecular recombination, after x finds its second occurrence in uxvxw, the 
molecule undergoes a strand exchange in x that leads to the formation of two 
new molecules: a linear DNA molecule uxw and a circular one •vx. The reverse 
operation is intermolecular recombination. (From [14].) 



Using an approach developed in [15] we use contexts to restrict the use of 
recombinations. A splicing scheme ([7], [8]) is a pair (X, ~) where X is the alphabet 






and ~, the pairing relation of the scheme, is a binary relation between triplets of 
nonempty words satisfying the following condition: if (p, x, q) ~ (p', y, q') then 
x = y. In the splicing scheme (X, ~), pairs (p, x, q) ~ (p', x, q') now define the 
contexts necessary for a recombination between the repeats x. Then we define 
contextual intramolecular recombination ([18]) as 

$$\{uxwxv\} \Longrightarrow \{uxv, \bullet wx\}, \quad \text{where } u = u'p,\; w = qw' = w''p',\; v = q'v'.$$

This constrains intramolecular recombination within uxwxv to occur only if 
the restrictions of the splicing scheme concerning x are fulfilled, i.e., the first 
occurrence of x is preceded by p and followed by q and its second occurrence is 
preceded by p' and followed by q' . (See Figure 4.) 

Similarly, if (p, x, q) ~ (p', x, q'), then we define contextual intermolecular 
recombination ([18]) as 

$$\{uxv, \bullet wx\} \Longrightarrow \{uxwxv\}, \quad \text{where } u = u'p,\; v = qv',\; w = w'p' = q'w''.$$

Informally, intermolecular recombination between the linear strand uxv and the 
circular strand •wx may take place only if the occurrence of x in the linear 
strand is flanked by p and q and its occurrence in the circular strand is flanked 
by p' and q'. Note that the sequences p, x, q, p', q' are nonempty, and that both 
contextual intra- and intermolecular recombinations are reversible by introducing 
pairs (p, x, q') ~ (p', x, q) in ~. (See Figure 4.) 
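
The two contextual operations can be illustrated on ordinary character strings. The following is a minimal sketch, not the authors' formalism: the function names, the single-letter contexts, and the convention of writing a circular strand •wx as the plain string w + x (understood up to rotation) are assumptions made only for this example.

```python
def occurrences(s, x):
    """Start indices of every occurrence of x in s (overlaps allowed)."""
    return [i for i in range(len(s) - len(x) + 1) if s.startswith(x, i)]


def intramolecular(strand, x, p, q, p2, q2):
    """Products of {u x w x v} => {u x v, circular w x}.

    The step is allowed only if u ends with p, w starts with q and ends
    with p2, and v starts with q2, i.e. (p, x, q) ~ (p2, x, q2).
    Returns a set of (linear, circular) pairs; the circular strand is
    written as the string w + x.
    """
    products = set()
    for i in occurrences(strand, x):
        for j in occurrences(strand, x):
            if j < i + len(x):
                continue  # the second repeat must follow the first without overlap
            u, w, v = strand[:i], strand[i + len(x):j], strand[j + len(x):]
            if (u.endswith(p) and w.startswith(q)
                    and w.endswith(p2) and v.startswith(q2)):
                products.add((u + x + v, w + x))
    return products


def intermolecular(linear, circular, x, p, q, p2, q2):
    """Products of {u x v, circular w x} => {u x w x v}.

    In the linear strand x must be flanked by p and q; in the circular
    strand x must be preceded by p2 and followed by q2 (read cyclically).
    """
    products = set()
    n = len(circular)
    for r in range(n):
        rotated = circular[r:] + circular[:r]  # try every rotation of the circle
        if not rotated.endswith(x):
            continue
        w = rotated[:n - len(x)]
        if not (w.endswith(p2) and w.startswith(q2)):
            continue
        for i in occurrences(linear, x):
            u, v = linear[:i], linear[i + len(x):]
            if u.endswith(p) and v.startswith(q):
                products.add(u + x + w + x + v)
    return products


# Excision with contexts (P, x, Q) ~ (R, x, S), then the reverse step
# with the swapped contexts (P, x, S) ~ (R, x, Q).
cut = intramolecular("aPxQbRxSc", "x", "P", "Q", "R", "S")
back = intermolecular("aPxSc", "QbRx", "x", "P", "S", "R", "Q")
```

Here `cut` contains the single pair `("aPxSc", "QbRx")` and `back` contains the original strand `"aPxQbRxSc"`, which mirrors the reversibility remark: the reverse step needs the pair (p, x, q') ~ (p', x, q).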

The above operations resemble the “splicing operation” introduced by Head 
in [7] and “circular splicing” ([8], [24], [21]). The papers [20], [4] and subsequently [25] 
showed that these models have the computational power of a universal Turing 
machine. (See [22] for a review.) 

The operations defined in [17] are particular cases of guided recombination, 
where all the contexts are empty, i.e., (λ, x, λ) ~ (λ, x, λ) for all x ∈ X⁺. This 
corresponds to the case where recombination may occur between every repeat 
sequence, regardless of the contexts. These unguided (context-free) recombinations 
are computationally not very powerful: we can prove that they can only 
generate regular languages. 

If we use the classical notion of a set, we can assume that the strings entering 
a recombination are available for multiple operations. Similarly, there would be 
no restriction on the number of copies of each strand produced by recombination. 
However, we can also assume some strings are only available in a limited number 
of copies. Mathematically this translates into using multisets, where one keeps 
track of the number of copies of a string at each moment. In the style of [6], if N is 
the set of natural numbers, a multiset of X* is a mapping M : X* → N ∪ {∞}, 
where, for a word w ∈ X*, M(w) represents the number of occurrences of w. 
Here, M(w) = ∞ means that there are unboundedly many copies of the string w. 
The set supp(M) = {w ∈ X* | M(w) ≠ 0}, the support of M, consists of the 
strings that are present at least once in the multiset M. 

We now define a guided recombination system that captures the series of 
dispersed homologous recombination events that take place during scrambled 
gene rearrangements in ciliates. 






Definition. ([18]) A guided recombination system is a triple R = (X, ~, A) 
where (X, ~) is a splicing scheme, and A ∈ X⁺ is a linear string called the 
axiom. 

A guided recombination system R defines a derivation relation, [18], that 
produces a new multiset from a given multiset of linear and circular strands, as 
follows. Starting from a “collection” (multiset) of strings with a certain number 
of available copies of each string, the next multiset is derived from the first one by 
an intra- or inter-molecular recombination between existing strings. The strands 
participating in the recombination are “consumed” (their multiplicity decreases 
by 1) whereas the products of the recombination are added to the multiset (their 
multiplicity increases by 1). 

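For two multisets S and S' in X* ∪ X•, we say that S derives S' and we 
write $S \Longrightarrow_R S'$, iff one of the following two cases holds ([18]): 

(1) there exist $\alpha \in \mathrm{supp}(S)$, $\beta, \bullet\gamma \in \mathrm{supp}(S')$ such that 

- $\{\alpha\} \Longrightarrow \{\beta, \bullet\gamma\}$ according to an intramolecular recombination step in R, 

- $S'(\alpha) = S(\alpha) - 1$, $S'(\beta) = S(\beta) + 1$, $S'(\bullet\gamma) = S(\bullet\gamma) + 1$; 

(2) there exist $\alpha', \bullet\beta' \in \mathrm{supp}(S)$, $\gamma' \in \mathrm{supp}(S')$ such that 

- $\{\alpha', \bullet\beta'\} \Longrightarrow \{\gamma'\}$ according to an intermolecular recombination step in R, 

- $S'(\alpha') = S(\alpha') - 1$, $S'(\bullet\beta') = S(\bullet\beta') - 1$, $S'(\gamma') = S(\gamma') + 1$. 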
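
The bookkeeping in the derivation relation (consumed strands lose one copy, products gain one, and ∞ stays ∞) can be sketched as follows. This is an illustrative sketch only: the class name `StrandMultiset`, its methods, and the convention of marking circular strands with a leading "•" character are assumptions for the example, and circular strands are not normalised up to rotation here.

```python
from math import inf


class StrandMultiset:
    """Multiset of strands: each strand maps to a multiplicity in N ∪ {inf}."""

    def __init__(self, counts=None):
        # Strands with multiplicity 0 are not stored, so the keys are the support.
        self.counts = {s: m for s, m in (counts or {}).items() if m != 0}

    def multiplicity(self, strand):
        return self.counts.get(strand, 0)

    def support(self):
        return set(self.counts)

    def derive(self, consumed, produced):
        """Return the multiset obtained by one recombination step.

        Every consumed strand must currently be available; its multiplicity
        decreases by 1 and every produced strand's multiplicity increases
        by 1. An infinite multiplicity is unchanged (inf - 1 == inf).
        """
        new_counts = dict(self.counts)
        for s in consumed:
            if new_counts.get(s, 0) == 0:
                raise ValueError(f"strand {s!r} is not available")
            new_counts[s] -= 1
        for s in produced:
            new_counts[s] = new_counts.get(s, 0) + 1
        return StrandMultiset(new_counts)


# Two copies of a linear strand; one intramolecular step consumes one copy
# and produces a shorter linear strand plus a circular strand.
S = StrandMultiset({"aPxQbRxSc": 2})
S1 = S.derive(consumed=["aPxQbRxSc"], produced=["aPxSc", "•QbRx"])
```

Whether a given step is actually permitted by the splicing scheme has to be checked separately; `derive` only performs the multiplicity update described in cases (1) and (2).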

Those strands which, by repeated recombinations with initial and intermediate 
strands, eventually produce the axiom form the language accepted by the 
guided recombination system. Formally ([18]), 

$$L_a^k(R) = \{ w \in X^* \mid \{w\} \Longrightarrow_R^* S \text{ and } A \in \mathrm{supp}(S) \},$$

where the multiplicity of w equals k. Note that $L_a^k(R) \subseteq L_a^{k+1}(R)$ for any 
k ≥ 1. 

Theorem. ([18]) Let L be a language over T* accepted by a Turing machine 
TM = (S, X ∪ {#}, P) as above. Then there exist an alphabet X', a sequence 
π ∈ X'*, depending on L, and a recombination system R such that a word w 
over T* is in L if and only if $\#^6 s_0 w \#^6 \pi$ belongs to $L_a^k(R)$ for some k ≥ 1. 

The preceding theorem implies that if a word w ∈ T* is in L(TM), then 
$\#^6 s_0 w \#^6 \pi$ belongs to $L_a^k(R)$ for some k and therefore it belongs to $L_a^i(R)$ for 
any i ≥ k. This means that, in order to simulate a computation of the Turing 
machine on w, any sufficiently large number of copies of the initial strand will 
do. The assumption that sufficiently many copies of the input strand are present 
at the beginning of the computation is in accordance with the fact that there are 
multiple copies of each strand available during the (polytene chromosome) stage 
where unscrambling occurs. Note that the preceding result is valid even if we 
allow interactions between circular strands or within a circular strand, particular 
cases of which have been formally defined in [17]. 

The proof that a guided recombination system can simulate the computa- 
tion of a Turing machine suggests that the micronuclear gene, present in multiple 






copies, consists of a sequence encoding the input data, combined with a sequence 
encoding a program, i.e., a list of encoded computation instructions. The “com- 
putation instructions” can be excised from the micronuclear gene and become 
circular “rules” that can recombine with the data. The process continues then by 
multiple intermolecular recombination steps involving the linear strand and cir- 
cular “rules” , as well as intramolecular recombinations within the linear strand 
itself. The resulting linear strand, which is the functional macronuclear copy of 
the gene, can then be viewed as the output of the computation performed on the 
input data following the computation instructions excised as circular strands. 

The last step, telomere addition and the excision of the strands between 
the telomere addition sites, can easily be added to our model as a final step 
consisting of the deletion of all the markers, rule delimiters and remaining rules 
from the output of the computation. This would result in a strand that contains 
only the output of the computation (macronuclear copy of the gene) flanked by 
end markers (telomere repeats) . This also provides a new interpretation for some 
of the vast quantity of non-encoding DNA found in micronuclear genes. 

In conclusion, in this section we presented a model proposed in [17] for the 
process of gene unscrambling in hypotrichous ciliates. While the model is consis- 
tent with our limited knowledge of this biological process, it needs to be rigor- 
ously tested using molecular genetics. We have shown, however, that the model 
is capable of universal computation. This both hints at future avenues for explor- 
ing biological computation and opens our eyes to the range of complex behaviors 
that may be possible in ciliates, and potentially available to other evolving ge- 
netic systems. 



References 

1. L. Adleman. Molecular computation of solutions to combinatorial problems. Science, 
vol. 266, Nov. 1994, 1021-1024. 

2. L. Adleman. On constructing a molecular computer. 1st DIMACS workshop on 
DNA based computers, Princeton, 1995. In DIMACS series, vol. 27 (1996), 1-21. 

3. E. Baum. Building an associative memory vastly larger than the brain. Science, 
vol. 268, April 1995, 583-585. 

4. E. Csuhaj-Varju, R. Freund, L. Kari, and G. Paun. DNA computing based on 
splicing: universality results. In L. Hunter and T. Klein (editors), Proceedings of 1st 
Pacific Symposium on Biocomputing. World Scientific Publ., Singapore, 1996, 179-190. 

5. M. Daley, L. Kari, G. Gloor, R. Siromoney. Circular contextual insertions/deletions 
with applications to biomolecular computation. Proceedings of String Processing 
and Information Retrieval ’99, Mexico, IEEE CS Press, 1999, in press. 

6. S. Eilenberg. Automata, Languages and Machines. Academic Press, New York, 
1984. 

7. T. Head. Formal language theory and DNA: an analysis of the generative capacity 
of recombinant behaviors. Bulletin of Mathematical Biology, 49 (1987), 737-759. 






8. T. Head. Splicing schemes and DNA. In Lindenmayer Systems, G. Rozenberg and A. 
Salomaa eds., Springer Verlag, Berlin, 1991, 371-383. 

9. T. Head, G. Paun, D. Pixton. Language theory and genetics. Generative 
mechanisms suggested by DNA recombination. In Handbook of Formal Languages 
(G. Rozenberg, A. Salomaa eds.), Springer Verlag, 1996. 

10. D.C. Hoffman and D.M. Prescott. Evolution of internal eliminated segments and 
scrambling in the micronuclear gene encoding DNA polymerase α in two Oxytricha 
species. Nucl. Acids Res. 25 (1997), 1883-1889. 

11. L. Kari. On insertions and deletions in formal languages. Ph.D. thesis, University 
of Turku, Finland, 1991. 

12. L. Kari. DNA computing: arrival of biological mathematics. The Mathematical 
Intelligencer, vol. 19, nr. 2, Spring 1997, 9-22. 

13. L. Kari. From Micro-Soft to Bio-Soft: Computing with DNA. Proceedings of 
BCEC’97 (Bio-Computing and Emergent Computation), Skovde, Sweden, World 
Scientific Publishing Co., 146-164. 

14. L. Kari, L.F. Landweber. Computational power of gene rearrangement. Proceedings 
of DNA Based Computers V, E. Winfree, D. Gifford eds., MIT, Boston, June 1999, 
203-213. 

15. L. Kari, G. Thierrin. Contextual insertions/deletions and computability. 
Information and Computation, 131, no. 1 (1996), 47-61. 

16. J. Kendrew et al., eds. The Encyclopedia of Molecular Biology. Blackwell Science, 
Oxford, 1994. 

17. L.F. Landweber, L. Kari. The evolution of cellular computing: nature’s solution to 
a computational problem. Proceedings of 4th DIMACS meeting on DNA based 
computers, Philadelphia, 1998, 3-15. 

18. L.F. Landweber, L. Kari. Universal molecular computation in ciliates. In Evolution 
as Computation, L.F. Landweber, E. Winfree, Eds., Springer Verlag, 1999. 

19. E. Meyer and S. Duharcourt. Epigenetic programming of developmental genome 
rearrangements in ciliates. Cell 87 (1996), 9-12. 

20. G. Paun. On the power of the splicing operation. Int. J. Comp. Math 59 (1995), 
27-35. 

21. D. Pixton. Linear and circular splicing systems. Proceedings of the First 
International Symposium on Intelligence in Neural and Biological Systems, IEEE 
Computer Society Press, Los Alamos, 1995, 181-188. 

22. G. Rozenberg and A. Salomaa, eds. Handbook of Formal Languages. Springer Verlag, 
Berlin, 1997. 

23. A. Salomaa. Formal Languages. Academic Press, New York, 1973. 

24. R. Siromoney, K.G. Subramanian and Dare Rajkumar. Circular DNA and splicing 
systems. In Parallel Image Analysis, Lecture Notes in Computer Science 654, 
Springer Verlag, Berlin, 1992, 260-273. 

25. T. Yokomori, S. Kobayashi and C. Ferretti. Circular splicing systems and DNA 
computability. Proc. of IEEE International Conference on Evolutionary 
Computation ’97, 1997, 219-224. 



A High Girth Graph Construction and a Lower 
Bound for Hitting Set Size for Combinatorial 
Rectangles* 

L. Sunil Chandran 

Indian Institute of Science, Bangalore 
sunil@csa.iisc.ernet.in 



Abstract. We give the following two results. 

First, we give a deterministic algorithm which constructs a graph of girth 
$\log_k(n) + O(1)$ and minimum degree k − 1, taking the number of nodes n 
and the number of edges $e = \lfloor nk/2 \rfloor$ as input. The graphs constructed by 
our algorithm are expanders of sub-linear sized subsets, that is subsets 
of size at most n'*, where <5 < |. Though methods which construct high 
girth graphs are known, the proof of our construction uses only very 
simple counting arguments in comparison. Also our algorithm works for 
all values of n or k. 

We also give a lower bound of m/8ε for the size of hitting sets for 
combinatorial rectangles of volume ε. This result is an improvement of the 
previously known lower bound, namely Ω(m + 1/ε + log(d)). The known 
upper bound for the size of the hitting set is m · poly(log(d)/ε) [LLSZ]. 



1 Introduction 

Expander graphs are those in which the neighbourhood of any not too large subset 
of vertices has large cardinality. These graphs have been used in several 
applications, e.g., parallel sorting networks [AKS], coding [SS], and hitting 
combinatorial rectangles [LLSZ]. Explicit construction of regular expander graphs 
was a hard problem. Some of the explicit constructions are given in 
[Marg73, Marg88, GabGal, LPS] etc. 

Our original aim in this paper was to see if simple rules like successively 
adding edges to an initially empty graph with each edge connecting vertices at 
a large distance in the current graph, could give provable expansion properties. 
The main result we obtain is that simple rules of the above form do indeed yield 
graphs which have large expansion provably, but for sublinear-sized subsets. In 
fact, the graphs we obtain will actually be high girth graphs, where girth is 
defined to be the length of the shortest cycle. Expansion on sublinear-sized 
subsets will follow from a result of Kahale’s [Kah]. 



* Dept, of Computer Science and Automation, Indian Institute of Science, Bangalore, 
560012. This work is based on the author’s masters thesis. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 283—290, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






The Main Result. 

We give an algorithm which takes a positive integer n and an “expected 
degree” k as input and creates a graph G = (V, E) with n nodes, $\lfloor nk/2 \rfloor$ edges 
and whose girth g satisfies the relation 

$$\frac{n}{2} \le 1 + \frac{(k+1)(k^{g-1} - 1)}{k - 1}.$$

It follows that $g \ge \log_k(n) + O(1)$. We prove that the degree of any node in the 
graph constructed by our algorithm will be k − 1, k, or k + 1. Thus given the 
values of girth (say g) and minimum degree (say t) our algorithm can be used to 

construct graphs with at most 2+ 1) jjodes. Note that this bound 

is comparable to the existential results of Erdős and Sachs [ES] and Sauer [S], who 
showed that the minimum number of vertices n(g, t) required for the girth to be 
≥ g and minimum degree ≥ t satisfies: 

$$n(g,t) \le 2\,\frac{(t-1)^{g-1} - 1}{t-2}, \quad \text{if } g \text{ is odd},$$

$$n(g,t) \le 4\,\frac{(t-1)^{g-2} - 1}{t-2}, \quad \text{if } g \text{ is even}.$$

(See also [Boll] page 107). 

That our high-girth graphs have good expansion on sublinear-sized subsets is 
shown as follows. Let X be a subset of V. $N(X) = \{v \in V : \exists u \in X, \{u, v\} \in E\}$ 
is defined to be the set of neighbours of X. The expansion of X is defined 
to be $|N(X)|/|X|$. In [Kah], Kahale proves that if G is a k-regular graph of girth 
$(\frac{4}{3} + o(1)) \log_{k-1} n$ then for any subset X of V of size at most $n^{\delta}$ 
(where $0 < \delta < 1$), 



mx)\ ^ 
|x| - 



fc - (fc - i)(3+°(i»^ 



It is implicit in the proof of this theorem that if we use a graph of girth 
$\log_k(n) + O(1)$ and minimum degree k − 1, then every subset X of size at most 
$n^{\delta}$ will have expansion $\beta = |N(X)|/|X|$ such that the following inequality 
is satisfied. 



> (fc-1- [/3])3(i°g.(0+0(i)) 



Solving this inequality we get j3 > (fc— 2)— fc^*^. Thus the graph constructed by our 
algorithm is such that each subset X, jXj < , has expansion > (fc — 2) — fc^"^. 

When (5 < |, the effect of the second term becomes negligible for sufficiently 
large fc. 

Other Known High-Girth Graph Constructions. 

Two other constructions which give values of girth greater than what we 
achieve are Ramanujan graphs by Lubotzky, Phillips and Sarnak [LPS] and cubic 
sextet graphs [Weiss, BH]. For example, the girth of the graphs constructed in 
[LPS] is $\ge (\frac{4}{3} + o(1)) \log_{k-1} n$. Our achievement is that the proof of our algorithm 






uses only elementary counting arguments. Also our construction works for all n, 
and all k < n. In comparison, the Ramanujan graph construction takes two unequal 
primes p and q, both congruent to 1 mod 4, and constructs (p + 1)-regular graphs 
with either t or t/2 vertices where $t = q(q^2 - 1)$. But these graphs are regular 
while ours are only approximately regular. The construction of cubic sextet graphs 
given by [Weiss, BH] gives high girth graphs which are 3-regular, but doesn’t 
generalize to arbitrary specified minimum degree. We have presented our algorithm 
assuming that n is even. Later we describe how to generalize it to the case of 
odd n also. 

An Auxiliary Result: Lower Bounding Hitting Set Size. 

As mentioned above, one of the applications of expander graphs is in the 
construction of small hitting sets for combinatorial rectangles, defined as follows. 

Let d and m be positive integers. Let U be $\{1, 2, \ldots, m\}^d$. That is, the 
universe U consists of all d-dimensional lattice points with all coordinates in 
$[m] = \{1, 2, \ldots, m\}$. A combinatorial rectangle is defined as a set of the form 
$R = R_1 \times R_2 \times R_3 \times \cdots \times R_d$, where $R_i \subseteq [m]$ for all $i \in \{1, 2, \ldots, d\}$. The volume 
of R is defined as $|R|/|U|$. 

An (m, d, ε) hitting set is defined as a subset $S \subseteq U$ such that for any 
combinatorial rectangle R with volume $vol(R) \ge \epsilon$, $S \cap R \ne \emptyset$. 

It is proved in [LLSZ] that, if $S_d$ is an (m, d, ε) hitting set, $|S_d| = \Omega(m + 1/\epsilon + \log d)$ 
when $m^{-d} \le \epsilon \le 2/9$. The value of ε should be in this range in 
order to prove the log d term. In this paper, we strengthen this lower bound 
by observing that a suitable graph associated with any hitting set must have 
expansion properties. Using this observation, we prove a lower bound of m/8ε, 
when $1/m^{d-1} \le 4\epsilon \le 1$ and $d \ge 2$. 



2 High-Girth Graph Construction 

The following algorithm takes the number of nodes n and the average degree k 
as input and constructs a graph of girth at least $\log_k(n) + O(1)$. All the nodes 
in the graph will have degree k − 1, k or k + 1. 

The Algorithm. 

Let n be an even integer. (This is just for convenience. We will describe the 
case when n is odd shortly.) Assume that in the beginning, we have a perfect 
matching on the n nodes. That is, we start with a graph having $n/2$ edges, the 
degree of each node being 1. Do for $i = n/2 + 1$ to $nk/2$: 

1. Let $S = \{u : degree(u) \le degree(v), \forall v \in V\}$. 

2. Let $T = \{(u,v) \in S \times V : distance(u,v) \ge distance(x,y), \forall (x,y) \in S \times V\}$. 

3. If there is a pair $(u,v) \in T$ such that $degree(v) \le j$, where $j = \lceil 2i/n \rceil$, and 
the edge {u, v} is not already in the graph, introduce a new edge {u, v} 
and go to the start of the loop. If there are several such pairs pick one arbitrarily. 
Else go to 4. 





4. Let p = distance(u, v), $(u,v) \in T$. Put p = p − 1. Now assign 
$T = \{(u,v) \in S \times V : distance(u,v) = p\}$. Go to 3. 

(The above algorithm does the following. In step 1, it collects in S all the vertices 
having the least degree in the graph. Next it collects in T all the pairs of nodes 
(u, v) such that distance(u, v) is maximum from the set of all pairs (u, v) with u 
in S. It then puts an edge (if it is not already there) between one such pair from T, 
making sure that both u and v have degree less than or equal to $j = \lceil 2i/n \rceil$. If 
no (u, v) satisfies this degree requirement, T is redefined as the set of pairs 
(u, v) such that u is in S and distance(u, v) = p − 1, where p is the distance between 
any pair (u, v) which was in T earlier.) 
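
The steps above can be sketched in Python for even n. This is an illustrative sketch rather than the author's implementation, and it rests on assumptions that the text leaves open: ties are broken by vertex index, a vertex unreachable from u is treated as being at infinite distance, and steps 3 and 4 are folded into one search for the farthest admissible partner (degree at most j = ⌈2i/n⌉, not u itself, not already adjacent to u). The function names are hypothetical.

```python
from collections import deque
from math import inf


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def build_high_girth_graph(n, k):
    """Grow a graph on n (even) vertices to n*k/2 edges, farthest pairs first."""
    if n < 4 or n % 2 or k < 1:
        raise ValueError("this sketch needs an even n >= 4 and k >= 1")
    adj = [set() for _ in range(n)]
    for a in range(0, n, 2):              # initial perfect matching: n/2 edges
        adj[a].add(a + 1)
        adj[a + 1].add(a)
    for i in range(n // 2 + 1, n * k // 2 + 1):
        j = -(-2 * i // n)                # ceil(2i/n), the current degree cap
        min_deg = min(len(nb) for nb in adj)
        best = None                       # (distance, u, v) of the farthest valid pair
        for u in range(n):
            if len(adj[u]) != min_deg:    # step 1: u must have minimum degree
                continue
            dist = bfs_distances(adj, u)
            for v in range(n):
                if v == u or v in adj[u] or len(adj[v]) > j:
                    continue
                d = dist.get(v, inf)      # unreachable counts as infinitely far
                if best is None or d > best[0]:
                    best = (d, u, v)
        if best is None:
            raise RuntimeError("no admissible pair found")
        _, u, v = best
        adj[u].add(v)
        adj[v].add(u)
    return adj


def girth(adj):
    """Length of the shortest cycle (inf for a forest), by BFS from every vertex."""
    best = inf
    for root in range(len(adj)):
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y], parent[y] = dist[x] + 1, x
                    queue.append(y)
                elif parent[x] != y:      # a non-tree edge closes a cycle
                    best = min(best, dist[x] + dist[y] + 1)
    return best


graph = build_high_girth_graph(16, 3)
degrees = [len(nb) for nb in graph]
```

For n = 16 and k = 3 the degrees stay within {2, 3, 4} and sum to 48, in line with the degree claim made for the construction; `girth(graph)` can then be compared with the bound proved for it.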

Lemma 1. The graph created by the above algorithm will be such that $\forall v \in V$, 
degree(v) will be k − 1, k, or k + 1. If $V_d$ denotes the set of vertices with degree d 
in the graph, $|V_{k-1}| = |V_{k+1}| \le n/2$. 

Proof. Let us use induction, as each batch of $n/2$ edges is introduced, that is, as 
the average degree of the graph goes up by 1. (Note that the parameter $j = \lceil 2i/n \rceil$ 
remains constant as a batch of $n/2$ edges is introduced, and when the next batch 
starts, it gets incremented by 1.) 

Consider the following induction hypothesis. 

When $i = dn/2$ iterations of the loop are over, the degree of any node in the 
graph will be d − 1, d or d + 1. Let X be the set of vertices with degree d − 1, Y 
that of vertices with degree d and Z that of vertices with degree d + 1. Then 
|X| = |Z|. 

In the beginning of the algorithm, average degree = 1 and the statement is 
true because there are only vertices of degree 1 in the graph. We have |X| = 
|Z| = 0. 

Now assuming the induction hypothesis after d batches of $n/2$ edges are 
introduced, let us prove that it is true even after the next batch is introduced. 
We note that since |X| = |Z|, $|X| \le n/2$. Thus when $n/2$ edges are introduced, 
each vertex in the set X gets a chance to increase its degree (because minimum 
degree vertices have preference). So after $n/2$ edges are introduced there will be 
no vertices of degree d − 1 left. Further no vertex will have degree > (d + 2), 
because while these edges are introduced the parameter j will remain equal to 
d + 1. Observe that since the sum of degrees during this stage can reach a 
maximum of (d + 1)n only, there are always at least 2 nodes of degree less than d + 2 
and therefore the algorithm will add an edge on each iteration. Thus after the 
new $n/2$ edges are introduced we retain the induction hypothesis statement that 
the degrees are d, d + 1 or d + 2. Now let $X_1, Y_1, Z_1$ be the new sets of d, d + 1 
and d + 2 degree vertices respectively. It suffices to show that $|X_1| = |Z_1|$. We 
know that 

$$|X_1|\,d + |Y_1|(d+1) + |Z_1|(d+2) = (d+1)\,n. \qquad (1)$$

We also have $|X_1| + |Y_1| + |Z_1| = n$. 

Therefore, $|Z_1| = |X_1|$. Thus we get back the induction hypothesis. Hence the 
result. 
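
The final step of the proof can be spelled out as follows. This is only a worked restatement of the degree-sum equation and the vertex count already given, with no additional assumptions.

```latex
\begin{align*}
|X_1|\,d + |Y_1|(d+1) + |Z_1|(d+2) &= (d+1)\,n, \\
(d+1)\bigl(|X_1| + |Y_1| + |Z_1|\bigr) &= (d+1)\,n, \\
\text{subtracting:}\qquad -|X_1| + |Z_1| &= 0 .
\end{align*}
```

The coefficient of $|Y_1|$ cancels, so the number of vertices one below the average degree always equals the number one above it.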







Construction for Odd n. 

If n is odd, one may start with an n-length cycle $C_n$ instead of a matching. 
Then at the beginning of the loop i = n + 1. Again, when we start the algorithm, 
every node has a degree of 2, which is equal to the average degree. Now think of 
introducing batches of $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$ edges alternately. That is, each odd numbered 
batch will contain $\lfloor n/2 \rfloor$ edges and every even numbered batch will contain $\lceil n/2 \rceil$ 
edges. We can consider a slightly different induction hypothesis. If k is even, that 
is after we have introduced $nk/2$ edges, every node in the graph will be of degree 
k − 1, k or k + 1. Moreover $|V_{k+1}| = |V_{k-1}|$, implying that $|V_{k+1}| = |V_{k-1}| \le n/2$. If k 
is odd, that is after introducing $\lfloor nk/2 \rfloor$ edges, every node in the graph will be of 
degree k − 1, k or k + 1 as before. But $|V_{k+1}| = |V_{k-1}| - 1$. We still have $|V_{k+1}| \le n/2$. 
The induction hypothesis also implies that just after each odd numbered batch 
of edges is introduced $|V_{k-1}| \le \lceil n/2 \rceil$, and just after every even numbered batch is 
introduced $|V_{k-1}| \le \lfloor n/2 \rfloor$. Thus when a new batch of edges is introduced, each 
node in the set $V_{k-1}$ will increase its degree as before. Also no node can attain 
a degree > (k + 2). Thus when k + 1 is even, we get back the same equation as 
equation 1. When k + 1 is odd, since we have only introduced $\lfloor n(k+1)/2 \rfloor$ edges, 
the equation becomes 

$$|V_k|\,k + |V_{k+1}|(k+1) + |V_{k+2}|(k+2) = (k+1)\,n - 1. \qquad (2)$$

Solving this we get $|V_{k+2}| = |V_k| - 1$. Thus again $|V_{k+2}| \le n/2$. 

Theorem 1. The above algorithm creates a graph which has a girth 

$$g \ge \log_k(n) + O(1).$$

Proof. Look at the final graph. Let the girth be g. This cycle (girdle) closed 
sometime back in the process. Go back to the stage just before closing the 
smallest cycle. Let $d = \lceil 2i/n \rceil$, where i is the loop iteration number at that time. 
For that iteration, we had selected a current minimum degree vertex u, that is, 
we had selected a pair (u, v) with $u \in S$. Let $B = \{x \in V : distance(u, x) \ge g\}$. 
Why did we not select a vertex from B to be connected with u? Because those 
vertices, if any, had already achieved a degree of (d + 1), and the algorithm prohibits 
connecting to them. But we know by Lemma 1 that the number of vertices of degree 
d + 1 can be at most $n/2$. Thus V − B contains at least $n/2$ nodes. Using 
the fact that k + 1 ≥ d + 1 (d + 1 being the maximum degree of the graph at 
that stage), the maximum number of nodes possible in V − B is 
$1 + (k+1) + (k+1)k + \cdots + (k+1)k^{g-2}$. Combining the lower and upper bounds for |V − B|, 
we have 

$$\frac{n}{2} \le 1 + (k+1) + (k+1)k + \cdots + (k+1)k^{g-2} = 1 + \frac{(k+1)(k^{g-1} - 1)}{k - 1},$$

which gives 

$$\log_k(n) + O(1) \le g.$$
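
One way to make the O(1) term explicit is sketched below. It starts from the geometric sum bounding |V − B| in the proof and assumes k ≥ 2, which is an assumption added here so that the closed form and the estimate (k + 1)/(k − 1) ≤ 3 are valid; the constant obtained is not claimed to be the best possible.

```latex
\begin{align*}
\frac{n}{2} &\le 1 + (k+1)\sum_{t=0}^{g-2} k^{t}
             = 1 + \frac{(k+1)(k^{g-1}-1)}{k-1} \\
            &\le 1 + 3\,(k^{g-1}-1) \;\le\; 3\,k^{g-1}, \\
k^{g-1} &\ge \frac{n}{6}
  \quad\Longrightarrow\quad
  g \ge \log_k(n) + 1 - \log_k 6 .
\end{align*}
```

Since $\log_k 6 \le \log_2 6 < 3$ for every $k \ge 2$, the additive term is bounded by a constant independent of n and k.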






3 A Lower Bound for Hitting Set Size 

Let d and m be positive integers. Let U be $\{1, 2, \ldots, m\}^d$. That is, the universe U 
consists of all d-dimensional lattice points with all coordinates in $[m] = \{1, 2, \ldots, m\}$. 
A combinatorial rectangle is defined as a set of the form $R = R_1 \times R_2 \times R_3 \times 
\cdots \times R_d$, where $R_i \subseteq [m]$ for all $i \in \{1, 2, \ldots, d\}$. The volume of R is defined 
as $|R|/|U|$. We will call a combinatorial rectangle with volume ε in U an 
(m, d, ε) combinatorial rectangle. 

An (m, d, ε) hitting set is defined as a subset $S_d \subseteq U$ such that for any 
combinatorial rectangle R with volume $vol(R) \ge \epsilon$, $S_d \cap R \ne \emptyset$. In the following 
theorem we give a lower bound for the size of an (m, d, ε) hitting set. 
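
For very small parameters the definition can be checked by brute force. The sketch below is illustrative only: the function names are hypothetical, the volume threshold is read as "at least ε", empty sides are skipped because they give volume 0, and the enumeration is exponential in m and d, so it is usable only for toy instances.

```python
from fractions import Fraction
from itertools import combinations, product


def nonempty_subsets(m):
    """All nonempty subsets of [m] = {1, ..., m}, as sorted tuples."""
    items = range(1, m + 1)
    for size in range(1, m + 1):
        yield from combinations(items, size)


def is_hitting_set(S, m, d, eps):
    """True if S meets every rectangle R_1 x ... x R_d of volume >= eps."""
    points = set(S)
    sides = list(nonempty_subsets(m))
    for rect in product(sides, repeat=d):
        size = 1
        for side in rect:
            size *= len(side)
        if Fraction(size, m ** d) < eps:
            continue                      # too small: need not be hit
        if not any(pt in points for pt in product(*rect)):
            return False                  # a large rectangle is missed
    return True


# m = 2, d = 2, eps = 1/2: the diagonal hits every rectangle of volume >= 1/2,
# while a single point misses the rectangle {2} x {1, 2}.
half = Fraction(1, 2)
diagonal_ok = is_hitting_set({(1, 1), (2, 2)}, 2, 2, half)
single_ok = is_hitting_set({(1, 1)}, 2, 2, half)
```

Here `diagonal_ok` is `True` and `single_ok` is `False`. Exact rational arithmetic via `Fraction` avoids rounding issues when the volume sits exactly on the threshold.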

Theorem 2. 

$$|S_d| \ge m/8\epsilon$$

when $d \ge 2$ and $1/m^{d-1} \le 4\epsilon \le 1$. 

Proof. Any (m, d, ε) hitting set $S_d$ can be viewed as a bipartite graph between 
the sets $A = [m]^{d-1}$ and $B = [m]$. Introduce an edge from $(x_1, x_2, \ldots, x_{d-1}) \in A$ 
to $x_d \in B$ if and only if $(x_1, x_2, \ldots, x_d) \in S_d$. Now observe that the number of 
edges starting from any (m, d − 1, 2ε) combinatorial rectangle R in A should 
be at least $m/2$. Otherwise there will be a subset C of B, $|C| \ge m/2$, such that 
$N(R) \cap C = \emptyset$, where N(R) is the set of neighbours of R. Then $R \times C$ will be a 
combinatorial rectangle in $[m]^d$ with volume ≥ ε which is not hit by $S_d$, which 
will be a contradiction. Thus if there are 1/4ε non-intersecting combinatorial 
rectangles of volume 2ε in A, it is obvious from the above argument that $|S_d|$ 
should be at least m/8ε. Instead of proving this directly, we choose to prove the 
result as follows. 

It is easy to see that one can choose positive integers $T_1, T_2, \ldots, T_{d-1}$, all at 
most m, such that $4\epsilon m^{d-1} \ge T_1 T_2 \cdots T_{d-1} \ge 2\epsilon m^{d-1}$. Select subsets 
$R_1, R_2, \ldots, R_{d-1}$ of [m] where $R_i$ is chosen uniformly at random from among 
all subsets of size $T_i$. Then for any given edge in the bipartite graph, the 
probability of it starting from a point in $R = R_1 \times R_2 \times \cdots \times R_{d-1} \subseteq A$ will be 
$T_1 T_2 \cdots T_{d-1}/m^{d-1} \le 4\epsilon$. Thus the expected number of edges starting from points in R 
will be at most $4\epsilon |S_d|$. As we have seen above, this expectation should be at 
least $m/2$. So $|S_d| \ge m/8\epsilon$. This completes the proof. 

4 Conclusion 

The high girth graph construction was invented while we were trying to explicitly
construct expander graphs. Explicitly constructing an asymptotic family of
expander graphs that guarantees expansion for linear-sized subsets is considered
a hard problem. When Margulis gave the first explicit construction, it was
considered a breakthrough [Marg73]. Other known explicit constructions are
given in [GabGal,LPS,Marg88]. It is known that graphs whose Laplacian
matrix has its second smallest eigenvalue well separated






from 0 are good expanders [Alon]. Some limited experiments we carried out suggest
that the construction we have given in fact yields expanders for linear-sized
sets, though we could not prove it. But we point out that high girth and high
minimum degree alone cannot guarantee expansion for linear-sized subsets. For
example, let $G$ be a high girth graph with $n$ nodes. Let the girth of this graph
be $\Omega(\log n)$. Suppose we are interested in the expansion of subsets of size $1/q$ or
less of the total nodes. Then construct a graph $G'$ of order $qn+1$ by connecting $q$
copies of $G$ to a central node as shown in Figure 1. Note that the girth of $G'$ will
again be $\Omega(\log(qn + 1))$, but the required expansion is not available.
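
The counterexample can be reproduced on a toy instance. The following sketch makes two assumptions that are not stated in the text: each copy is attached to the central node by a single edge (so no new cycles are created), and a plain cycle stands in for the high girth graph G. All function names are illustrative.

```python
def cycle_graph(n):
    """Adjacency sets of the cycle on nodes 0..n-1 (stand-in for G)."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def star_of_copies(base_adj, q):
    """Join q disjoint copies of a graph to one extra central node.

    Node v of copy c is renamed c*n + v; the central node is q*n.
    Assumption: node 0 of every copy is the one attached to the centre.
    """
    n = len(base_adj)
    hub = q * n
    adj = {hub: set()}
    for c in range(q):
        offset = c * n
        for v, nbrs in base_adj.items():
            adj[offset + v] = {offset + w for w in nbrs}
        adj[offset].add(hub)
        adj[hub].add(offset)
    return adj, hub

def outer_neighbours(adj, subset):
    """Nodes outside `subset` that are adjacent to some node of `subset`."""
    subset = set(subset)
    return {w for v in subset for w in adj[v]} - subset
```

A whole copy, which holds roughly a 1/q fraction of the nodes, has the central node as its only outside neighbour, so it does not expand.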




Fig. 1. 



References 

AKS. M. Ajtai, J. Komlos and E. Szemeredi. Sorting in c log n parallel steps.
Combinatorica 3 (1983), 1-19.

Alon. N. Alon. Eigenvalues and expanders. Combinatorica 6 (1986), no. 2, 83-96.

BH. N.L. Biggs and M.J. Hoare. The sextet construction for cubic graphs.
Combinatorica 3 (1983), 153-165.

Boll. B. Bollobas. Extremal Graph Theory. Academic Press, London, 1978.

ES. P. Erdos and H. Sachs. Regulare Graphen gegebener Taillenweite mit minimaler
Knotenzahl. Math. Nat. 12 (1963), no. 3, 251-258.

GabGal. O. Gabber and Z. Galil. Explicit constructions of linear sized
superconcentrators. J. Comput. System Sci. 22 (1981), no. 3, 407-420.

Kah. N. Kahale. PhD Thesis. Massachusetts Institute of Technology, 1993.

LLSZ. N. Linial, M. Luby, M. Saks, D. Zuckerman. Efficient construction of a small
hitting set for combinatorial rectangles in high dimension. Combinatorica
17 (1997), no. 2, 215-234.






LPS. A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica 8
(1988), no. 3, 261-271.

Marg73. G.A. Margulis. Explicit constructions of concentrators. Problemy peredaci
informacii 9 (1973), no. 4, 71-80.

Marg88. G.A. Margulis. Explicit group theoretical constructions of combinatorial
schemes and their application to the design of expanders and concentrators.
Problemy peredaci informacii 24 (1988), no. 1, 51-60.

NZ. N. Nisan and D. Zuckerman. More deterministic simulation in logspace. Proc.
of the 25th ACM Symposium on the Theory of Computation, 1993, pp. 235-244.

S. N. Sauer. Extremaleigenschaften regularer Graphen gegebener Taillenweite I
and II. Sitzungsberichte Osterreich. Akad. Wiss. Math. Natur. Kl. S-B II 176
(1967), 9-25; ibid. 176 (1967), 27-43.

SS. M. Sipser and D. Spielman. Expander Codes. IEEE Transactions on
Information Theory, 1996, Vol. 42, No. 6, 1710-1722.

Weiss. A. I. Weiss. Girth of bipartite sextet graphs. Combinatorica 4 (1984), 241-245.



Protecting Facets in Layered Manufacturing* 



Jorg Schwerdt^1, Michiel Smid^1, Ravi Janardan^2, Eric Johnson^2, and
Jayanth Majhi^3

^1 Department of Computer Science, University of Magdeburg,
Magdeburg, Germany
{schwerdt,michiel}@isg.cs.uni-magdeburg.de
^2 Department of Computer Science and Engineering, University of Minnesota,
Minneapolis, MN 55455, USA
{janardan,johnson}@cs.umn.edu
^3 Mentor Graphics Corporation,
8005 S.W. Boeckman Road, Wilsonville, OR 97070, USA
jayanth_majhi@mentorg.com



1 Introduction 

Layered Manufacturing (LM) is an emerging technology that is gaining impor- 
tance in the manufacturing industry. (See e.g. the book by Jacobs [7].) This 
technology makes it possible to rapidly build three-dimensional objects directly 
from their computer representations on a desktop-sized machine connected to 
a workstation. A specific process of LM, that is widely in use, is StereoLithog- 
raphy. The input to this process is the triangulated boundary of a polyhedral 
CAD model. This model is first sliced by horizontal planes into layers. Then, 
the object is built layer by layer in the following way. The StereoLithography 
apparatus consists of a vat of photocurable liquid resin, a platform, and a laser. 
Initially, the platform is below the surface of the resin at a depth equal to the 
layer thickness. The laser traces out the contour of the first slice on the sur- 
face and then hatches the interior, which hardens to a depth equal to the layer 
thickness. In this way, the first layer is created; it rests on the platform. Then, 
the platform is lowered by the layer thickness and the just-vacated region is 
re-coated with resin. The subsequent layers are then built in the same way. 

It may happen that the current layer overhangs the previous one. Since this 
leads to instabilities during the process, so-called support structures are gener- 
ated to prop up the portions of the current layer that overhang the previous 
layer. (See Figure 1 for an illustration in two dimensions.) These support struc- 
tures are computed before the process starts. They are also sliced into layers, 
and built simultaneously with the object. After the object has been built, the 
supports are removed. Finally, the object is postprocessed in order to remove 
residual traces of the supports. 

* This work was funded in part by a joint research grant by DAAD and by NSF. RJ, 
EJ and JM were also supported in part by NSF grant CCR-9712226. Part of this 
work was done while JS and MS visited the University of Minnesota in Minneapolis. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 291—303, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






An important issue in this process is choosing an orientation of the model 
so that it can be built in the vertical direction. Equivalently, we can keep the 
model fixed, and choose a direction in which the model is built layer by layer. 
This direction is called the build direction. It affects the number of layers, the 
surface finish, the quantity of support structures used, and their location on the 
object being built. 

1.1 Overview of Our Results 

Let V be the three-dimensional polyhedron that we want to build using LM. V 
may have holes, but we assume that it is non-self-intersecting. In this paper, we 
consider the problem of computing all build directions for V for which a pre- 
scribed facet is not in contact with supports (i.e., protected from supports). This 
is an important problem because support removal from a facet can affect surface 
quality and accuracy adversely, thereby impacting the functional properties of 
critical facets, such as, for instance, facets on gear teeth. This problem, which 
we describe below, arose from discussions with engineers at Stratasys, Inc. — a 
Minnesota-based world leader in LM. To our knowledge, the work presented 
here constitutes the first provably correct, complete, and efficient solution to 
this important problem; current practice in industry is based on trial and error. 
Throughout, we assume that the facets of V are triangles. (This is the standard 
STL format used in industry.) The number of facets of V is denoted by n. We 
solve the following problems: 

Problem 1. Given a facet F of V, compute a description of all build directions 
for which F is not in contact with supports. 



Problem 2. Compute a description of all build directions for which the total area 
of all facets of V that are not in contact with supports is maximum. 

In Section 2, we give a formal definition of the notion of a facet being in
contact with supports. In Section 3, we give an algorithm that solves Problem 1
in $O(n^2)$ time. This result also implies an $O(k^2 n^2)$-time algorithm to protect
any set of $k \geq 1$ facets of V from supports. We have implemented a simplified
version of this algorithm. Test results on models obtained from industry are
given in Section 3.3. In Section 3.4, we show that Problem 2 can be solved in
$O(n^4)$ time. Section 4 concludes with some open problems.

The algorithms solving Problems 1 and 2 use fundamental concepts from 
computational geometry, such as convex hulls, arrangements of line segments, 
and the overlay of planar graphs. These concepts, however, are applied to points 
and segments on the unit sphere. A complete version of this paper appears as [14] . 

1.2 Prior Related Work 

The problem of computing a “good” build direction has been considered in 
the literature. Asberg et al.[2] give efficient algorithms that decide if a model 
can be made by StereoLithography without using support structures. Allen and 






Dutta [1] consider the problem of minimizing the total area of all parts of the 
model that are in contact with support structures. They give a heuristic for 
this problem, but without any analysis of the running time or the quality of 
the approximation. Bablani and Bagchi [3] present heuristics for improving the 
accuracy and finish of the part. 

In our previous work [8,9,10,15], we have used techniques from computational 
geometry to compute build directions that optimize various design criteria. In [9] , 
algorithms are given that minimize, for convex polyhedra, the volume of support 
structures used, and, independently, the total area of those parts of the model 
that are in contact with support structures. Both algorithms have a running
time that is bounded by $O(n^2)$, where n is the number of facets. For general
polyhedra, it is shown that a build direction that minimizes the so-called
stair-step error can be computed in $O(n \log n)$ time. In [10], algorithms are given that
minimize a combination of these measures. (For all measures that involve support 
structures, the algorithms only work for convex polyhedra.) In [8], algorithms 
are given that minimize support structures for two-dimensional simple polygons. 
Finally, in [15], the implementation of an algorithm that minimizes the number 
of layers, is discussed. This algorithm works for general polyhedra. 

While writing the final version of the current paper, we became aware of 
related work by Nurmi and Sack [12]. (Private communication from J.-R. Sack.) 
They consider the following problem: Given a convex polyhedron A and a set of 
convex polyhedral obstacles, compute all directions of translations that move A 
arbitrarily far away such that no collision occurs between A and any of the 
obstacles. If we take for A a facet F of a polyhedron V, and for the obstacles the 
other facets of V, then we basically get Problem 1. Our algorithm for solving 
Problem 1 is similar to that of Nurmi and Sack. However, our algorithm is 
tailored to our particular application, and takes advantage of the fact that the 
objects of interest are the facets of a polyhedron. Moreover, we give rigorous 
proofs that handle all boundary cases — something that is crucial when deploying 
our algorithm in an actual LM application, since such boundary cases are very 
common in real-world STL files. 

2 Geometric Preliminaries 

The number of facets (i.e., triangles) of V is denoted by n. We consider each 
facet and edge of V as being closed. We allow that a vertex of one facet is in the 
interior of an edge e of another facet. 

The unit sphere, i.e., the boundary of the three-dimensional ball centered at
the origin and having radius one, is denoted by $\mathbb{S}^2$. We consider directions as
points (or unit vectors) on $\mathbb{S}^2$. For any point $x \in \mathbb{R}^3$ and any direction $d \in \mathbb{S}^2$,
we denote by $r_{xd}$ the ray emanating from $x$ having direction $d$.

We now formally define the notion of a point or facet being in contact with 
supports for a given build direction. (See also Figure 1 for the two-dimensional 
variants.) It turns out to be convenient to distinguish three cases. 

Let $F$ be a facet of V, and $d \in \mathbb{S}^2$ a direction. Let $\alpha$ be the angle between $d$
and the outer normal of $F$. Note that $0 \leq \alpha \leq \pi$. If $\alpha < \pi/2$, then we say that $F$






is a front facet w.r.t. $d$. Similarly, if $\alpha > \pi/2$, then we say that $F$ is a back facet
w.r.t. $d$. Finally, if $\alpha = \pi/2$, then we say that $F$ is a parallel facet w.r.t. $d$, or
that $d$ is parallel to $F$.
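
Since the angle is below, above, or equal to $\pi/2$ exactly when the dot product of $d$ with the outer normal is positive, negative, or zero, the three cases can be told apart without computing the angle. The sketch below is a minimal illustration; the function names, the tolerance `tol`, and the convention that triangle vertices are listed counter-clockwise when seen from outside are assumptions.

```python
def outer_normal(a, b, c):
    """Normal of triangle (a, b, c) via the cross product (b - a) x (c - a).

    Assumption: the vertices are listed counter-clockwise when seen from
    outside the polyhedron, so the result points outwards.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def classify_facet(normal, d, tol=1e-12):
    """Return 'front', 'back' or 'parallel' for build direction d."""
    dot = sum(n * x for n, x in zip(normal, d))
    if dot > tol:        # angle smaller than pi/2
        return "front"
    if dot < -tol:       # angle larger than pi/2
        return "back"
    return "parallel"    # angle equal to pi/2 (up to the tolerance)
```

With floating-point input, the tolerance decides which near-parallel facets are treated as parallel; exact arithmetic would avoid that choice.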

Definition 1. Let $F$ be a facet of V, $x$ a point on $F$, and $d$ a direction on $\mathbb{S}^2$.
Point $x$ is in contact with supports for build direction $d$, if one of the following
three conditions holds.

1. $F$ is a back facet w.r.t. $d$.

2. $F$ is a front facet w.r.t. $d$, and the ray $r_{xd}$ intersects the boundary of V in
a point that is not on facet $F$.

3. $F$ is a parallel facet w.r.t. $d$, and there is a facet $G$ such that (a) the ray $r_{xd}$
intersects $G$, and (b) at least one of the vertices of $G$ is strictly on the same
side of the plane through $F$ as the outer normal of $F$.



Definition 2. Let $F$ be a facet of V, and $d$ a direction on $\mathbb{S}^2$. We say that $F$ is
in contact with supports for build direction $d$, if there is a point in the interior
of $F$ that is in contact with supports for build direction $d$.







Fig. 1. Illustrating the two-dimensional variant of Definitions 1 and 2 for a 
planar simple polygon with 19 vertices. The shaded regions are the supports 
for the vertical build direction d. No interior point of the vertical edges (5, 6) 
and (8,9) is in contact with supports. On the other hand, all points of the 
vertical edge (10,11) are in contact with supports. For build direction d, the 
following edges are in contact with supports: (1,2), (2,3), (3,4), (4,5), (6,7), 
(7,8), (10,11), (11,12), (12,13), (13,14), (17,18), and (18,19). 



In this paper, we will need two notions of convexity. The (standard) convex
hull of the points $p_1, p_2, \ldots, p_k$ will be denoted by $CH(p_1, p_2, \ldots, p_k)$. This
notion can be generalized to spherical convexity for directions on $\mathbb{S}^2$ in a natural
way. (See also Chen and Woo [4].) Note that if a set of directions contains two
antipodal points, their spherical convex hull is the entire unit sphere. We say
that a set $D$ of directions on $\mathbb{S}^2$ is hemispherical, if there is a
plane $H$ through the origin, such that all elements of $D$ are strictly on one side
of $H$.

3 Protecting One Facet from Supports 

Throughout this section, F denotes a fixed facet of V. We consider the problem 
of computing a description of all build directions d, such that facet F is not in 
contact with supports, when V is built in direction d. 

The idea is as follows. For each facet $G$, we define a set $C_{FG} \subseteq \mathbb{S}^2$ of
directions, such that for each $d \in C_{FG}$, facet $F$ is in contact with supports for
build direction $d$ “because of” facet $G$. That is, there is a point $x$ in the interior
of $F$, such that the ray $r_{xd}$ emanating from $x$ and having direction $d$ intersects
facet $G$. Hence, for each direction in the complement of the union of all sets $C_{FG}$,
facet $F$ is not in contact with supports. It will turn out, however, that we have
to be careful with directions that are on the boundary of a set $C_{FG}$.

For any facet $G$ of V, and any point $x \in \mathbb{R}^3$, we define

$$R_{xG} := \{ d \in \mathbb{S}^2 : (r_{xd} \cap G) \setminus \{x\} \neq \emptyset \}.$$

Hence, if $d \in R_{xG}$, then the ray from $x$ in direction $d$ intersects facet $G$ in a
point that is not equal to $x$. For any facet $G$ of V, we define

$$C_{FG} := \bigcup_{x \in F} R_{xG}.$$

For any facet $G$, we denote by $P_G$ the great circle consisting of all directions
parallel to $G$.
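
Membership of a single direction in $R_{xG}$ amounts to a ray-triangle intersection test. The sketch below uses the standard Möller-Trumbore test, which is a swapped-in technique and not taken from this paper; the function name and the tolerance are assumptions, and rays parallel to the plane of $G$ are simply rejected, so the degenerate coplanar case is not handled.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hits_triangle(x, d, tri, tol=1e-12):
    """Return True if the ray from x in direction d meets the closed
    triangle tri = (v0, v1, v2) in a point different from x.

    Assumption: rays parallel to the triangle's plane are reported as
    misses, which ignores the coplanar boundary case.
    """
    v0, v1, v2 = tri
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    h = _cross(d, e2)
    a = _dot(e1, h)
    if abs(a) < tol:                 # ray parallel to the plane of the triangle
        return False
    f = 1.0 / a
    s = _sub(x, v0)
    u = f * _dot(s, h)               # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return False
    q = _cross(s, e1)
    v = f * _dot(d, q)               # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return False
    t = f * _dot(e2, q)              # ray parameter of the hit point
    return t > tol                   # strictly ahead of x, hence not x itself
```

Sampling points $x$ on $F$ and calling this test gives a crude numerical check of membership in $C_{FG}$; it does not replace the exact characterizations given by the lemmas.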

Lemma 1. Let $G$ be a facet that is not coplanar with $F$. Assume that (i) $F$
and $G$ have more than one point in common, or (ii) $F$ and $G$ intersect in a
single point, which is a vertex of one facet and in the interior of an edge of the
other facet. Also, assume that for each vertex of $G$, one of the following is true:
It is in the plane through $F$ or on the same side of the plane through $F$ as the
outer normal of $F$. Then $C_{FG}$ is the set of all directions that are (1) on or on
the same side of $P_F$ as the outer normal of facet $F$, and (2) on or on the same
side of $P_G$ as the inner normal of facet $G$. □

Let $G$ be a facet of V. Assume that either $F$ and $G$ are disjoint, or intersect
in a single point which is a vertex of both facets. It can be shown that $C_{FG}$ is
hemispherical. We will show now how to compute $C_{FG}$.

First, we introduce some notation. Let $s = (s_x, s_y, s_z)$ and $t = (t_x, t_y, t_z)$
be two distinct points in $\mathbb{R}^3$, and let $\ell$ be the Euclidean length of the point
$t - s := (t_x - s_x, t_y - s_y, t_z - s_z)$. That is, $\ell$ is the Euclidean distance between $t - s$
and the origin. We will denote by $d_{st}$ the point on $\mathbb{S}^2$ having the same direction
as the directed line segment from $s$ to $t$. That is, $d_{st}$ is the point on $\mathbb{S}^2$ having
coordinates $d_{st} = ((t_x - s_x)/\ell, (t_y - s_y)/\ell, (t_z - s_z)/\ell)$.

Lemma 2. Let $G$ be a facet of V. Assume that $F$ and $G$ are disjoint, or intersect
in a single point which is a vertex of both facets. Let

$$D_{FG} := \{ d_{st} \in \mathbb{S}^2 : s \text{ is a vertex of } F,\ t \text{ is a vertex of } G,\ s \neq t \}.$$

Then $C_{FG}$ is the spherical convex hull of the at most nine directions in $D_{FG}$.
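
The generating directions of Lemma 2 are easy to compute. The sketch below only collects the set $D_{FG}$ for two triangles given by their vertex coordinates; computing the spherical convex hull itself is not shown, and the function names are hypothetical.

```python
import math

def direction(s, t):
    """Unit vector d_st pointing from point s to point t (s != t)."""
    diff = [t[i] - s[i] for i in range(3)]
    length = math.sqrt(sum(c * c for c in diff))
    if length == 0.0:
        raise ValueError("d_st is undefined when s equals t")
    return tuple(c / length for c in diff)

def generating_directions(F, G):
    """Set D_FG of directions d_st, s a vertex of F, t a vertex of G, s != t.

    F and G are triples of 3D points. Duplicates collapse in the set,
    so the result has at most nine elements.
    """
    return {direction(s, t) for s in F for t in G if tuple(s) != tuple(t)}
```

For two parallel unit triangles, one directly above the other, three of the nine vertex pairs give the same vertical direction, so only seven distinct directions remain.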



Proof. We denote the spherical convex hull of the elements of $D_{FG}$ by $SCH_{FG}$.
We will prove that (i) $C_{FG} \subseteq SCH_{FG}$, and (ii) $C_{FG}$ is spherically convex. Since
$SCH_{FG}$ is the smallest spherically convex set that contains the elements of $D_{FG}$,
the fact that $D_{FG} \subseteq C_{FG}$, together with (i) and (ii), implies that $C_{FG} = SCH_{FG}$.

To prove (i), let $d \in C_{FG}$. We will show that $d \in SCH_{FG}$. Since $d \in C_{FG}$,
there is a point $x$ on $F$ such that $d \in R_{xG}$. Let $y$ be any point such that $y \neq x$
and $y \in r_{xd} \cap G$.

Let $a$, $b$, and $c$ denote the three vertices of $F$, and $u$, $v$, and $w$ the three vertices
of $G$. Since $x \in CH(a, b, c)$, we have $y - x \in CH(y - a, y - b, y - c)$. Similarly, since
$y \in CH(u, v, w)$, we have $y - a \in CH(u - a, v - a, w - a) =: CH_1$,
$y - b \in CH(u - b, v - b, w - b) =: CH_2$, and $y - c \in CH(u - c, v - c, w - c) =: CH_3$. Therefore,
$CH(y - a, y - b, y - c) \subseteq CH(CH_1, CH_2, CH_3)$. Hence, if we denote the (standard)
convex hull of the point set $\{t - s : s$ is a vertex of $F$, $t$ is a vertex of $G\}$ by
$CH_{FG}$, then we have shown that $y - x \in CH(CH_1, CH_2, CH_3) = CH_{FG}$. Since
$y - x \neq 0$, and the ray from the origin through point $y - x$ has direction $d$, it
follows that $d \in SCH_{FG}$.

To prove (ii), let $d_1$ and $d_2$ be two distinct directions in $C_{FG}$, and let $d$ be
any direction on the shortest great arc that connects $d_1$ and $d_2$. We have to show
that $d \in C_{FG}$. Let $H$ be the plane through $d_1$, $d_2$, and the origin. Then $d$ is
also contained in $H$. Let $d' \in \mathbb{R}^3$ be the intersection between the line through $d_1$
and $d_2$, and the line through the origin and $d$. (Note that $d'$ is not the origin.)
Let $0 \leq \lambda \leq 1$ be such that $d' = \lambda d_1 + (1 - \lambda) d_2$. Since $d_1 \in C_{FG}$, there is a
point $x_1$ on $F$ such that the ray from $x_1$ in direction $d_1$ intersects $G$ in a point
that is not equal to $x_1$. Let $y_1$ be any such intersection point. Similarly, there
are points $x_2$ on $F$, and $y_2$ on $G$, such that $y_2 \neq x_2$ and the ray from $x_2$ in
direction $d_2$ intersects $G$ in $y_2$. Let $\alpha$, $\alpha_1$, and $\alpha_2$ be the positive real numbers
such that $d = \alpha d'$, $y_1 - x_1 = \alpha_1 d_1$, and $y_2 - x_2 = \alpha_2 d_2$. Then

$$d = \left( \frac{\alpha\lambda}{\alpha_1} y_1 + \frac{\alpha(1-\lambda)}{\alpha_2} y_2 \right) - \left( \frac{\alpha\lambda}{\alpha_1} x_1 + \frac{\alpha(1-\lambda)}{\alpha_2} x_2 \right).$$

Let $\gamma$ be such that $\gamma(\alpha\lambda/\alpha_1 + \alpha(1 - \lambda)/\alpha_2) = 1$. Define
$x := (\gamma\alpha\lambda/\alpha_1) x_1 + (\gamma\alpha(1 - \lambda)/\alpha_2) x_2$, and
$y := (\gamma\alpha\lambda/\alpha_1) y_1 + (\gamma\alpha(1 - \lambda)/\alpha_2) y_2$. Then $\gamma > 0$, and $\gamma d = y - x$.
Hence $x$ is a convex combination of $x_1$ and $x_2$, and $y$ is a convex combination
of $y_1$ and $y_2$. It follows that $x$ and $y$ are points on the facets $F$ and $G$, respectively.
Moreover, $y \neq x$. Hence, $d \in C_{FG}$. □
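
The construction in part (ii) can be checked numerically. The sketch below follows the quantities $\alpha$, $\alpha_1$, $\alpha_2$, and $\gamma$ of the proof and returns the witness points $x$ and $y$; the function name is hypothetical, and the input is assumed to satisfy $y_1 \neq x_1$, $y_2 \neq x_2$, and $d_1 \neq -d_2$.

```python
import math

def _norm(v):
    return math.sqrt(sum(c * c for c in v))

def _lin(p, u, q, v):
    """Return the linear combination p*u + q*v of two 3D points."""
    return tuple(p * a + q * b for a, b in zip(u, v))

def convexity_witness(x1, y1, x2, y2, lam):
    """Given rays x1->y1 and x2->y2 and a weight 0 <= lam <= 1, return
    (d, gamma, x, y) with gamma * d = y - x, as in part (ii) of the proof.

    Assumptions: y1 != x1, y2 != x2, and the two directions are not antipodal.
    """
    a1 = _norm(_lin(1.0, y1, -1.0, x1))          # alpha_1 = |y1 - x1|
    a2 = _norm(_lin(1.0, y2, -1.0, x2))          # alpha_2 = |y2 - x2|
    d1 = _lin(1.0 / a1, y1, -1.0 / a1, x1)       # unit direction d_1
    d2 = _lin(1.0 / a2, y2, -1.0 / a2, x2)       # unit direction d_2
    d_prime = _lin(lam, d1, 1.0 - lam, d2)       # point on the chord d_1 d_2
    alpha = 1.0 / _norm(d_prime)                 # d = alpha * d'
    d = tuple(alpha * c for c in d_prime)
    w1 = alpha * lam / a1
    w2 = alpha * (1.0 - lam) / a2
    gamma = 1.0 / (w1 + w2)
    x = _lin(gamma * w1, x1, gamma * w2, x2)     # convex combination on F
    y = _lin(gamma * w1, y1, gamma * w2, y2)     # convex combination on G
    return d, gamma, x, y
```

Because $x$ and $y$ are convex combinations with the same weights, they stay on $F$ and $G$ whenever the facets are convex, which is the case for triangles.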







3.1 More Properties of the Sets $C_{FG}$

We say that a facet G is below facet F, if each vertex of G is in the plane 
through F or on the same side of this plane as the inner normal of F. 

Lemma 3. Let $d$ be a direction on $\mathbb{S}^2$ such that $F$ is a front facet w.r.t. $d$.
Assume that $F$ is in contact with supports for build direction $d$. Then there is a
facet $G$, such that (i) $G$ is not below $F$, and (ii) $d$ is in the interior of $C_{FG}$.

Proof. Assume w.l.o.g. that $d$ is the vertically upwards direction. Hence, $F$ is
not vertical. There is a disk $D_0$ in the interior of $F$, having positive radius, such
that each point of $D_0$ is in contact with supports for build direction $d$. That
is, the vertical ray that emanates from any point of $D_0$ intersects the boundary
of V in a point that is not on $F$. Thus, there is a disk $D$ of positive radius in
the interior of $D_0$, and a facet $G$, $G \neq F$, such that for each $x \in D$, the ray $r_{xd}$
intersects $G$. Put differently, let $VC := \{r_{xd} : x \in D\}$, i.e., $VC$ is the vertical
“cylinder” which is unbounded in direction $d$ and is bounded from below by $D$.
Since $F$ is not vertical, the cylinder $VC$ is not contained in a plane. If we move
the disk $D$ vertically upwards, then it stays in the cylinder $VC$, and each point
of the moving disk passes through $G$. Clearly, facet $G$ is not below $F$.

Let $c$ and $\epsilon$ be the center and radius of $D$, respectively. Since $D$ is not
contained in a vertical plane, and $\epsilon > 0$, there is a spherical disk $SD$ on $\mathbb{S}^2$,
centered at $d$ and having positive radius, such that for each $d' \in SD$, the ray $r_{cd'}$
intersects $G$ in a point that is not equal to $c$. This implies that $SD$ is completely
contained in $C_{FG}$. Since $d$ is in the interior of $SD$, it follows that $d$ is in the
interior of $C_{FG}$. □

The following lemma is the converse of Lemma 3. 

Lemma 4. Let $G$ be a facet of V that is not below $F$. Let $d$ be a direction that
is not parallel to $F$ and that is in the interior of $C_{FG}$. Then $F$ is in contact with
supports for build direction $d$. □

We denote by $U_F$ the union of the sets $C_{FG}$, where $G$ ranges over all facets
that are not below $F$. Lemma 3 immediately implies the following two lemmas,
which state when a front facet $F$ is not in contact with supports.

Lemma 5. Let $d$ be a direction on $\mathbb{S}^2$ such that $F$ is a front facet w.r.t. $d$. If $d$
is not in the interior of $U_F$, then $F$ is not in contact with supports for build
direction $d$. □

Lemma 6. Let $d$ be a direction on $\mathbb{S}^2$ such that $F$ is a front facet w.r.t. $d$.
Assume that (i) $d$ is in the interior of $U_F$, and (ii) $d$ is not in the interior
of $C_{FG}$, for all facets $G$ that are not below $F$. Then $F$ is not in contact with
supports for build direction $d$. □

The following three lemmas treat certain “boundary” cases, which involve 
directions that are parallel to F. 






Lemma 7. Let $d$ be a direction on the great circle $P_F$ that is not contained in
any of the sets $C_{FG}$, where $G$ ranges over all facets that are not below $F$. Then
facet $F$ is not in contact with supports for build direction $d$. □

Lemma 8. Let $G$ be a facet that is not below $F$. Let $d$ be a direction on the
great circle $P_F$, such that (i) either $d$ is in the interior of the set $C_{FG}$, or (ii) $d$
is in the interior of an edge of $C_{FG}$, and this edge is contained in $P_F$. Then
facet $F$ is in contact with supports for build direction $d$. □

Let $A$ be the arrangement on $\mathbb{S}^2$ defined by the great circle $P_F$ and the
boundaries of the sets $C_{FG}$, where $G$ ranges over all facets that are not below $F$.

Lemmas 7 and 8 do not consider directions on $P_F$ that are vertices of $A$.
For these directions, we will use the following lemma to decide if $F$ is in contact
with supports.

Lemma 9. Let $d$ be a direction on the great circle $P_F$, $W := \{r_{xd} : x \in F\}$,
and $X$ the set of all facets $G$ such that (i) $G$ is not below $F$, and (ii) the
intersection of $G$ and the interior of $W$ is non-empty. Then $F$ is in contact with
supports for build direction $d$ if and only if $X \neq \emptyset$. □

3.2 The Facet Protection Algorithm 

The algorithm that computes a description of all build directions for which 
facet F is not in contact with supports is based on the previous results. 

Step 1: Following Lemma 1, we do the following. For each facet $G$ of V that
is not below $F$, and such that (i) either $F$ and $G$ have more than one point in
common, or (ii) $F$ and $G$ intersect in a single point, which is a vertex of one facet
and in the interior of an edge of the other facet, do the following: Compute the
boundary of the set $C_{FG}$ as the set of all directions that are on or on the same
side of $P_F$ as the outer normal of facet $F$, and on or on the same side of $P_G$ as
the inner normal of facet $G$.

Step 2: Following Lemma 2, we do the following. For each facet $G$ of V that
is not below $F$, and such that (i) either $F$ and $G$ are disjoint, or (ii) $F$ and $G$
intersect in a single point which is a vertex of both facets, do the following:
Compute the boundary of the set $C_{FG}$ as the spherical convex hull of the (at
most nine) directions $d_{st}$ with $s \neq t$, where $s$ and $t$ are vertices of $F$ and $G$, respectively.

Step 3: Compute the arrangement $A$ on $\mathbb{S}^2$ that is defined by the great circle
$P_F$ and the bounding edges of all sets $C_{FG}$ that were computed in Steps 1 and 2.
Let $B$ be the arrangement on $\mathbb{S}^2$ consisting of all vertices, edges, and faces of $A$
that are on $P_F$ or on the same side of $P_F$ as the outer normal of $F$. Give each
edge $e$ of $B$ an orientation, implying the notions of being to the “left” and “right”
of $e$. For edges that are not contained in the great circle $P_F$, these orientations
are chosen arbitrarily. For each edge $e$ of $B$ that is contained in $P_F$, we choose
the orientation such that the outer normal of $F$ is to the “left” of $e$. For each
edge $e$ of $B$, compute the following three values:






1. $l_e$ (resp. $r_e$), which is one if the interior of the face of $B$ to the left (resp.
right) of $e$ is contained in some set $C_{FG}$ that was computed in Step 1 or 2,
and zero otherwise. If $e$ is contained in $P_F$, then the value of $r_e$ is not defined.

2. $i_e$, which is one if the interior of $e$ is in the interior of some set $C_{FG}$ that
was computed in Step 1 or 2, and zero otherwise.

Moreover, for each vertex $v$ of $B$ that is not on $P_F$, compute the value $i_v$, which
is one if $v$ is in the interior of some set $C_{FG}$ that was computed in Step 1 or 2,
and zero otherwise. We will show later how Step 3 can be implemented.

Step 4: Select all edges $e$ of $B$ that are not contained in $P_F$, and for which $l_e = 1$
and $r_e = 0$, or $l_e = 0$ and $r_e = 1$. Also, select all edges $e$ of $B$ that are contained
in $P_F$, and for which $l_e = 0$ and $i_e = 0$.

By Lemmas 5 and 7, these edges define polygonal regions on $\mathbb{S}^2$ that represent
build directions for which facet $F$ is not in contact with supports.

Step 5: Select all edges $e$ of $B$ that are not contained in $P_F$, and for which
$l_e = r_e = 1$ and $i_e = 0$. Similarly, select all vertices $v$ of $B$ that are not on $P_F$, for
which $i_v = 0$, and having the property that $l_e = r_e = 1$ for all edges $e$ of $B$ that
have $v$ as one of their vertices.

By Lemma 6, these vertices and the interiors of these edges represent build
directions for which facet $F$ is not in contact with supports.

Step 6: Let $D$ be the set of all vertices of $B$ that are on $P_F$. For each
direction $d \in D$, decide if facet $F$ is in contact with supports for build direction $d$.
This can be done by using an algorithm that is immediately implied by Lemma 9.
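
Once the flags of Step 3 are known, the selection rules of Steps 4 and 5 are plain case distinctions. The sketch below encodes them directly; the function names, the string labels, and the representation of the flags as integers 0 and 1 are illustrative assumptions, not taken from the authors' implementation.

```python
def edge_role(on_PF, l, r, i):
    """Classify an edge of the arrangement from its Step 3 flags.

    on_PF: True if the edge lies on the great circle P_F (r is then ignored).
    Returns 'polygon boundary' (Step 4), 'free arc' (Step 5), or None.
    """
    if on_PF:
        # Step 4, second rule: left face is free and the edge interior is free.
        return "polygon boundary" if (l == 0 and i == 0) else None
    if l != r:
        # Step 4, first rule: exactly one of the two adjacent faces is covered.
        return "polygon boundary"
    if l == 1 and r == 1 and i == 0:
        # Step 5: both faces covered, but the edge interior itself is free.
        return "free arc"
    return None

def vertex_is_free_direction(i_v, incident_flags):
    """Step 5 rule for a vertex that is not on P_F.

    incident_flags: list of (l, r) pairs, one per edge incident to the vertex.
    """
    return i_v == 0 and all(l == 1 and r == 1 for l, r in incident_flags)
```

An edge with both adjacent faces free returns `None` because it lies inside a free polygon and is not part of its boundary.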

This algorithm reports a collection of spherical polygons, great arcs (the edges 
computed in Step 5), and single directions (the vertices computed in Steps 5 
and 6). It follows from the previous results that this collection represents all 
build directions for which facet F is not in contact with supports. We now 
consider Step 3 in more detail. 

After Steps 1 and 2, we have a collection of at most $n - 1$ spherical polygons,
each having $O(1)$ edges. For each such edge $e$, let $K_e$ be the great circle that
contains $e$. Using an incremental algorithm, we compute the arrangement $A'$
on $\mathbb{S}^2$ of the $O(n)$ great circles $K_e$, and the great circle $P_F$. By removing from $A'$
all vertices and edges that are strictly on the same side of $P_F$ as the inner normal
of facet $F$, we obtain an arrangement which we denote by $B'$. We give each edge
of $B'$ a direction. We will show how the values $l_e$, $r_e$, $i_e$, and $i_v$ for all edges $e$ and
vertices $v$ of $B'$ can be computed. Since the arrangement $B$ is obtained from $B'$
by removing all vertices and edges that are not contained in edges of our original
polygons $C_{FG}$, this solves our problem.

We introduce the following notation. For each vertex $v$ of $B'$, let $I_v$ be the
set of all facets $G$ that are not below $F$ and for which $v$ is in the interior of $C_{FG}$.
For each edge $e$ of $B'$, let $L_e$ be the set of all facets $G$ that are not below $F$
and for which the interior of the face of $B'$ to the left of $e$ is contained in $C_{FG}$.
Similarly, let $R_e$ be the set of all facets $G$ that are not below $F$ and for which
the interior of the face of $B'$ to the right of $e$ is contained in $C_{FG}$. Clearly,

1. $i_v = 1$ if and only if $I_v \neq \emptyset$,

2. $l_e = 1$ if and only if $L_e \neq \emptyset$,

3. $r_e = 1$ if and only if $e$ is not contained in $P_F$ and $R_e \neq \emptyset$, and

4. $i_e = 1$ if and only if $L_e \cap R_e \neq \emptyset$.

The idea is to traverse each great circle that defines the arrangement $B'$, and
maintain the sizes of the sets $I_v$, $L_e$, $R_e$, and $L_e \cap R_e$. We number the facets
of V arbitrarily from 1 to $n$. Let $K$ be any of the great circles that define $B'$, and
let $v$ be a vertex of $B'$ which is on $K$. By considering all facets $G$ that are not
below $F$, we compute the set $I_v$ and store it as a bit-vector $I$ of length $n$. By
traversing this array $I$, we compute the number of ones it contains, and deduce
from this number the value $i_v$.

Let $e$ be an edge of $B'$ which is contained in $K$ and has $v$ as a vertex. By
considering all edges of $B'$ that have $v$ as endpoint, we know which sets $C_{FG}$
are entered or left, when our traversal along $K$ leaves $v$ and enters the interior
of $e$. We make two copies of the bit-vector $I$, and call them $L$ and $R$. Then,
by flipping the appropriate bits in the three arrays $L$, $R$, and $I$, we obtain the
bit-vectors for the sets $L_e$, $R_e$, and $L_e \cap R_e$, respectively, and the number of ones
they contain. This gives us the values $l_e$, $r_e$, and $i_e$.

We now continue our traversal of the great circle $K$. Each time we reach or
leave a vertex of $B'$, we flip the appropriate bits in the arrays $L$, $R$, and $I$, and
deduce the $l$, $r$, and $i$ values. By the zone theorem [5], the running time of the
entire algorithm is bounded by $O(n^2)$.
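
The bookkeeping behind this traversal can be sketched in a few lines: keep one bit per facet together with a running count of ones, so that each emptiness test takes constant time after a flip. The sketch below handles a single bit-vector along one great circle; the function name and the event format (a list of facet indices crossed at each step) are assumptions, and facets are numbered from 0 rather than from 1.

```python
def sweep_flags(initial_members, crossings, n):
    """Track whether a set of facets is non-empty along a traversal.

    initial_members: facet indices whose region contains the start position.
    crossings: for each traversal step, the facet indices whose region
               boundary is crossed at that step (each crossing flips one bit).
    Returns one 0/1 flag for the start position and one per step.
    """
    bits = [0] * n
    for g in initial_members:
        bits[g] = 1
    count = sum(bits)                 # number of ones, computed once
    flags = [1 if count else 0]
    for crossed in crossings:
        for g in crossed:
            bits[g] ^= 1              # entering or leaving the region of facet g
            count += 1 if bits[g] else -1
        flags.append(1 if count else 0)
    return flags
```

Running three such counters side by side, one each for the arrays $L$, $R$, and $I$, gives the $l$, $r$, and $i$ values during the traversal.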

Theorem 1. Let V be a polyhedron with $n$ triangular facets, possibly with holes,
and let $F$ be a facet of V. In $O(n^2)$ time, we can compute a description of all
build directions for which $F$ is not in contact with supports. □

In the full paper [14], we show that the set of all build directions for which
facet $F$ is not in contact with supports can have $\Omega(n^2)$ connected components.
Hence, the algorithm of Theorem 1 is worst-case optimal.

The algorithm can be extended to the case when $k$ facets $F_1, F_2, \ldots, F_k$
have to be protected from supports. For each $F_i$, we run Steps 1 and 2 of the
algorithm in Section 3.2. This gives us a set of spherically convex regions of total
size $O(kn)$. We then compute the arrangement of these regions and the $k$ great
circles $P_{F_i}$, which has size $O(k^2 n^2)$, and then traverse it in essentially the way
described in the algorithm. The running time is bounded by $O(k^2 n^2)$.

3.3 Experimental Results 

We have implemented a simplified version of the algorithm of Theorem 1. In this
implementation, the boundary of the union $U_F$ is computed incrementally, i.e.,
the sets $C_{FG}$, where $G$ ranges over all facets that are not below $F$, are added
one after another in a brute force manner. The program outputs a collection of
spherical polygons on $\mathbb{S}^2$ such that for each direction in such a polygon, facet $F$
is not in contact with supports. For details, see [13].

The program is written in C++ using LEDA [11], and run on a SUN Ultra
(300 MHz, 512 MByte RAM). We have tested our implementation on real-world
polyhedral models obtained from Stratasys, Inc. Although the running time of






model         n      #F     #C_FG   |U_F|   min    max     average

rd_yelo.stl   396    396    99      3.5     0.01   73      16
cover-5.stl   906    906    482     8.2     0.03   558     103
tod21.stl     1,128  1,128  229     3.8     0.05   281     25
stlbin2.stl   2,762  1,330  1,178   20.9    0.25   2,019   363
mj.stl        2,832  1,000  641     10.1    0.26   2,270   146



Table 1. n denotes the number of facets of the model; #F denotes the number
of facets F for which we ran the program independently and over which we averaged
our bounds; #C_FG denotes the average number of facets G that are not below F; |U_F|
denotes the average number of vertices on the boundary of the union $U_F$ (note
that this union may have no vertices at all); min, max, and average denote the
minimum, maximum, and average time in seconds.



our implementation is 0(n^) in the worst case, the actual running time is rea- 
sonable in practice. 

Table 1 gives test results for five polyhedral models. rd_yelo.stl is a long
rod, with grooves cut along its length; cover-5.stl resembles a drawer for a
filing cabinet; tod21.stl is a bracket, consisting of a hollow quarter-cylinder,
with two flanges at the ends, and a through-hole drilled in one of the flanges;
stlbin2.stl is an open rectangular box, with a hole on each side and interior
flanges at the corners; mj.stl is a curved part with a base and a protrusion,
shaped like a pistol.

3.4 Solving Problem 2 

For each facet F of V, we compute the sets $C_{FG}$ for all facets G that are not
below F. Let C be the arrangement on the unit sphere defined by the great circles Pp, and
by the great circles that contain an edge of some set $C_{FG}$. Note that C is defined
by $O(n^2)$ great circles and, hence, consists of $O(n^4)$ vertices, edges, and faces.

This arrangement C has the following property. Let f (resp. e) be any face
(resp. edge) of C, and let d and d' be any two directions in the interior of f
(resp. in the interior of e). Let T (resp. T') be the total area of all facets of V
that are not in contact with supports, if V is built in direction d (resp. d'). Then
T = T'. Problem 2 is solved by traversing each of the $O(n^2)$ great circles that
define C.

Theorem 2. Let V be a polyhedron with n triangular facets, possibly with holes.
In $O(n^4)$ time, we can compute a description of all build directions for which the
total area of all facets that are not in contact with supports is maximum. □
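Since the protected area is constant on every face and edge of the arrangement, the final selection step reduces to comparing one number per cell. The following Python sketch shows only that last step; it assumes that some earlier computation has already produced, for each cell, the set of facets that are free of supports, and the names `best_cells`, `protected_by_cell` and `facet_area` are hypothetical.

```python
def best_cells(protected_by_cell, facet_area):
    """Return the maximum protected area and the cells that attain it.

    protected_by_cell: maps a cell identifier to the set of facets that are
        not in contact with supports for directions in that cell (assumed input).
    facet_area: maps a facet identifier to its area.
    """
    totals = {
        cell: sum(facet_area[f] for f in facets)
        for cell, facets in protected_by_cell.items()
    }
    best = max(totals.values())
    # Several cells may attain the maximum; report all of them.
    return best, sorted(cell for cell, total in totals.items() if total == best)
```

The sketch recomputes each sum from scratch for clarity; the traversal described above instead updates the total incrementally when moving from one cell to a neighbouring one.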



4 Concluding Remarks 

We have shown that for a fixed facet F of the polyhedron V, a description of all 
build directions for which F is not in contact with supports can be computed in 
$O(n^2)$ time, which is worst-case optimal. A natural question is to ask for the time
complexity of computing one build direction for which F is not in contact with 






supports, or decide that such a direction does not exist. This problem appears
to be closely related to the following one: Given n + 1 triangles $T_0, T_1, T_2, \ldots, T_n$
in the plane, decide if $T_0$ is contained in the union of $T_1, \ldots, T_n$. This problem
is 3SUM-hard, see Gajentaan and Overmars [6]. Therefore, we conjecture that
computing a single direction for which facet F is not in contact with supports
is 3SUM-hard as well.
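For readers unfamiliar with the base problem: 3SUM asks whether three of n given integers sum to zero, and a problem is called 3SUM-hard if it is at least as hard as 3SUM. The sketch below is the textbook quadratic-time algorithm for 3SUM (sort, then scan with two pointers); it is included only as background and is not part of the paper's algorithms. The function name `has_three_sum` is an illustrative choice.

```python
def has_three_sum(values):
    """Decide whether three entries (at distinct positions) sum to zero.

    Runs in O(n^2) time after an O(n log n) sort.
    """
    nums = sorted(values)
    n = len(nums)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            total = nums[i] + nums[lo] + nums[hi]
            if total == 0:
                return True
            if total < 0:
                lo += 1   # need a larger sum
            else:
                hi -= 1   # need a smaller sum
    return False
```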

The algorithms of Sections 3.2 and 3.4 have running time $O(n^2)$ and $O(n^4)$,
respectively. It would be interesting to design output-sensitive algorithms for 
solving these problems. 

We are not aware of any efficient algorithm that minimizes support structures 
for general three-dimensional polyhedra. Is it possible to compute, in polynomial 
time, such a build direction? 



References 

1. S. Allen and D. Dutta. Determination and evaluation of support structures in
layered manufacturing. Journal of Design and Manufacturing, 5:153-162, 1995.

2. B. Asberg, G. Blanco, P. Bose, J. Garcia-Lopez, M. Overmars, G. Toussaint,
G. Wilfong, and B. Zhu. Feasibility of design in stereolithography. Algorithmica,
19:61-83, 1997.

3. M. Bablani and A. Bagchi. Quantification of errors in rapid prototyping processes
and determination of preferred orientation of parts. In Transactions of the 23rd
North American Manufacturing Research Conference, 1995.

4. L.-L. Chen and T. C. Woo. Computational geometry on the sphere with application
to automated machining. Journal of Mechanical Design, 114:288-295, 1992.

5. M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational
Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997.

6. A. Gajentaan and M. H. Overmars. On a class of $O(n^2)$ problems in computational
geometry. Comput. Geom. Theory Appl., 5:165-185, 1995.

7. P. F. Jacobs. Rapid Prototyping & Manufacturing: Fundamentals of StereoLithography.
McGraw-Hill, New York, 1992.

8. J. Majhi, R. Janardan, J. Schwerdt, M. Smid, and P. Gupta. Minimizing support
structures and trapped area in two-dimensional layered manufacturing. Comput.
Geom. Theory Appl., 12:241-267, 1999.

9. J. Majhi, R. Janardan, M. Smid, and P. Gupta. On some geometric optimization
problems in layered manufacturing. Comput. Geom. Theory Appl., 12:219-239,
1999.

10. J. Majhi, R. Janardan, M. Smid, and J. Schwerdt. Multi-criteria geometric
optimization problems in layered manufacturing. In Proc. 14th Annu. ACM Sympos.
Comput. Geom., pages 19-28, 1998.

11. K. Mehlhorn and S. Näher. LEDA: a platform for combinatorial and geometric
computing. Commun. ACM, 38:96-102, 1995.

12. O. Nurmi and J.-R. Sack. Separating a polyhedron by one translation from a set
of obstacles. In Proc. 14th Internat. Workshop Graph-Theoret. Concepts Comput.
Sci. (WG '88), volume 344 of Lecture Notes Comput. Sci., pages 202-212.
Springer-Verlag, 1989.






13. J. Schwerdt, M. Smid, R. Janardan, and E. Johnson. Protecting facets in layered
manufacturing: implementation and experimental results. In preparation, 1999.

14. J. Schwerdt, M. Smid, R. Janardan, E. Johnson, and J. Majhi. Protecting facets in
layered manufacturing. Report 10, Department of Computer Science, University
of Magdeburg, Magdeburg, Germany, 1999.

15. J. Schwerdt, M. Smid, J. Majhi, and R. Janardan. Computing the width of a
three-dimensional point set: an experimental study. In Proc. 2nd Workshop on
Algorithm Engineering, pages 62-73, Saarbrücken, 1998.



The Receptive Distributed π-Calculus*

(Extended Abstract) 



Roberto M. Amadio¹, Gérard Boudol², and Cédric Lhoussaine³



¹ Université de Provence, Marseille
² INRIA, Sophia-Antipolis
³ Université de Provence, Marseille



Abstract. In this paper we study an asynchronous distributed
π-calculus, with constructs for localities and migration. We show that a
simple static analysis ensures the receptiveness of channel names, which,
together with a simple type system, guarantees that any migrating message
will find an appropriate receiver at its destination locality. We argue
that this receptive calculus is still expressive enough, by showing that it
contains the $\pi_1$-calculus, up to weak asynchronous bisimulation.



1 Introduction 

In this paper we study a simplified version of Hennessy and Riely’s distributed
π-calculus Dπ [8]. This is a calculus based on the asynchronous, polyadic π-calculus,
involving explicit notions of locality and migration, that is code movement from
one location to another. In this model communication is purely local, so that
messages to remote resources (i.e. receivers, in π-calculus terminology) must be
explicitly routed. In such a model - as opposed to other ones like the Join
calculus [6,7] or the $\pi_{1\ell}$-calculus [2] where messages transparently go through
the network to reach their (unique) destination - the problem arises of how to
avoid the situation where a message migrates to, or stays in a locality where no
receiver will ever be available. In other words, we would like to ensure that any
message will find an appropriate receiver in its destination locality, a property
that we call message deliverability.

To solve this problem, we must ensure that a resource will be available when
requested. An obvious way to achieve this is to enforce receptiveness - following
Sangiorgi’s terminology [10] - of any (private) channel name, that is the property
that at any time a process is able to offer an input on that channel name.
However, the kind of receptiveness we are looking for is neither the “uniform”
nor the “linear” one described in [10]. Indeed, requiring each (private) name to
be either uniform or linear receptive would result in a dramatic loss of expressive
power. We are therefore seeking a formalization of something like the (unique)
naming of a persistent “object”, which may encapsulate a state - we also wish
to ensure the desirable property of unicity of receivers, which turns out to be
needed for our expressiveness result.

* Work partially supported by the RNRT project MARVEL. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 304—315, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 




It is actually not very difficult to design a simple inference system for
statically checking processes, in such a way that private names are (uniquely)
receptive inside their scope: the system is a refinement of the one of $\pi_{1\ell}$, where
we impose that recursive processes have a unique input parameter. Then for
instance an input cannot be nested inside another, unless they are inputs on
the same name. This is not enough to entail the message deliverability property
however, since we not only have channel names, but also location names, and
other names, called keys, that may be compared for equality but cannot be used
as communication channels - the latter are crucially needed to retain the
expressive power, as we shall see. In particular, the restriction operation $(\nu u)P$ of
the π-calculus must be refined, because we do not demand a location name or a
key to be receptive - there is no notion of a receiver for such a name - and still
we want such a name to be sometimes restricted in scope. Types seem to be the
obvious solution for this particular problem.

The type system we consider in this paper appears to be a simplified version 
of the simple type system of [8]: we use location types, that record the names and 
types of channels that may be used to communicate inside a locality, and we use 
“located types” for channels that are sent together with their location, instead 
of the existential types of [8]. We show the usual “subject reduction property”, 
which holds for the system for checking receptiveness too. Then we are able to 
prove that typed receptive processes do not run into the kind of deadlocks we 
wished to avoid, where a message never finds a corresponding receiver. 

The issue of avoiding deadlocks by statically inferring some properties has 
already been addressed, most notably by Kobayashi who studied in [9] means to 
avoid deadlocks in the π-calculus (see also [4]). His approach is quite different
from ours, however: he uses a sophisticated type system where types involve 
time tags and where typing contexts record an ordering on the usage of names, 
whereas we only use a very simple information - the set of receiver’s names. 
Since we regard receivers as passive entities, providing resources for names, we 
are not seeking to avoid the situation where no message is sent to a receiver - 
in this situation the resource provider should be garbage collected. 

The question arises whether the nice properties of our receptive distributed
π-calculus are obtained at the expense of its expressivity. Indeed, receptiveness
is both a strong safety property and a quite heavy constraint. We therefore
establish a result that proves that expressive power is not lost: we show that the
$\pi_1$-calculus of Amadio [2], in which the Join calculus can be encoded [3], may be
translated into our receptive $\pi_1$-calculus, in a way which is fully abstract with
respect to weak asynchronous bisimulation.



2 Distributed Processes 



In this section we introduce our calculus of distributed processes, which may be
seen as a simplification of the distributed π-calculus of Hennessy and Riely [8],
and as an extension of the $\pi_1$-calculus of Amadio [2,3]. This is basically the






asynchronous, polyadic π-calculus, with some primitives for spatial distribution
of processes and migration.

As usual, we assume given a denumerable set $\mathcal{N}$ of (simple) names, ranged
over by a, b, c, . . . We will informally distinguish a subset of names, that we 
denote which are supposed to name localities - formally, there will be 

different kinds of types for names. Then the processes may also send and receive
compound names, which are pairs of simple names that we write $a@\ell$, meaning
“the (channel) name a used at location $\ell$”. We use u, v, w, ... to denote simple
or compound names, and a (possibly empty) vector of such names is written $\tilde{u}$.
We shall often use the operation $\_@\ell$ on (compound) names, defined by



This operation is extended to sets of names in the obvious way. We use A, B, ...
to denote process identifiers, belonging to a denumerable set $\mathcal{P}$, disjoint from $\mathcal{N}$.
These are parametric on names, and we shall write a recursive call as $A(a; \tilde{u})$.
The intended meaning is that the name a is the only one that may - and must -
be used as an input channel in the body of a recursive definition of A. The syntax
of distributed processes involves types, but we defer any other consideration
about types to the next section. Then the grammar for terms is as follows,
where $\tau$ is a type:



For any w, we define its subject subj(w) as follows: subj(a) = a = subj($a@\ell$) =
subj($a : \tau$). In $(\nu w)P$, the subject of w is the only name occurring in w which is
bound. We make the standard assumption that recursion is guarded, that is in
$(\mathrm{rec}\,A(a; \tilde{u}).P)(b; \tilde{v})$ all recursive calls to A in P occur under an input guard.
We shall say that P is closed if it does not contain any free process identifier.
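The three shapes of names and the subject function can be made concrete with a small Python sketch. The class names `Located` and `Typed` and the representation of simple names as plain strings are assumptions made for illustration only; they are not notation from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Located:
    """A compound name a@l: channel a used at location l."""
    channel: str
    location: str


@dataclass(frozen=True)
class Typed:
    """A typed name a : t, as it appears under a restriction."""
    name: str
    type_: str


# A name w is a simple name (a string), a compound name, or a typed name.
Name = Union[str, Located, Typed]


def subj(w: Name) -> str:
    """Subject of w: subj(a) = a = subj(a@l) = subj(a : t)."""
    if isinstance(w, str):
        return w
    if isinstance(w, Located):
        return w.channel
    if isinstance(w, Typed):
        return w.name
    raise TypeError(f"not a name: {w!r}")
```

In a restriction over w, the name returned by `subj(w)` is the one that becomes bound; a location appearing in a compound name stays free.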

The need for compound names is explained by Hennessy and Riely in [8], although
they use more sophisticated “dependent names”. In our syntax we do not
distinguish “the process located at $\ell$” from “the process P moving towards $\ell$”,
written respectively $\ell[\![P]\!]$ and $\ell :: P$ in [8], and $\{P\}_\ell$ and spawn($\ell$, P) in [2]:
both are denoted by $[\ell :: P]$. Another difference with [8] is that our underlying
π-calculus is asynchronous, and we use explicit recursion instead of replication.

In order to define a notion of bisimulation, we describe the behaviour of
processes by means of a labelled transition system for the calculus. That is,
processes perform actions and evolve to new processes in doing so, which is
denoted $P \xrightarrow{\alpha} P'$. The set of actions is given by:




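$u, v, \ldots ::= a \mid a@\ell$

$w ::= u \mid a : \tau$

$P, Q, R, \ldots ::= \bar{a}\tilde{u} \mid a(\tilde{u}).P \mid (P \mid Q) \mid [a = b]P,Q \mid (\nu w)P \mid A(a; \tilde{u}) \mid (\mathrm{rec}\,A(a; \tilde{u}).P)(b; \tilde{v}) \mid [\ell :: P]$



$\alpha ::= \tau \mid u\tilde{v} \mid (\nu\tilde{w})\bar{u}\tilde{v}$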



meaning respectively internal communication (there should be no confusion
between $\tau$ as a type and $\tau$ as an action), input of names $\tilde{v}$ on the name u - if
$u = a@\ell$, this means input on channel a at location $\ell$ - and output of names,






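(out)  $\bar{a}\tilde{u} \xrightarrow{\bar{a}\tilde{u}} 0$

(in)   $a(\tilde{u}).P \xrightarrow{a\tilde{v}} [\tilde{v}/\tilde{u}]P$

(ext)  $\dfrac{P \xrightarrow{\alpha} P'}{(\nu w)P \xrightarrow{(\nu w)\alpha} P'}$  (*)

(ν)    $\dfrac{P \xrightarrow{\alpha} P'}{(\nu w)P \xrightarrow{\alpha} (\nu w)P'}$  (**)

(cm)   $\dfrac{P \xrightarrow{(\nu\tilde{w})\bar{u}\tilde{v}} P' \quad Q \xrightarrow{u\tilde{v}} Q'}{P \mid Q \xrightarrow{\tau} (\nu\tilde{w})(P' \mid Q')}$  $\mathrm{subj}(\tilde{w}) \cap \mathrm{fn}(Q) = \emptyset$

(cp)   $\dfrac{P \xrightarrow{\alpha} P'}{P \mid Q \xrightarrow{\alpha} P' \mid Q}$  $\mathrm{bn}(\alpha) \cap \mathrm{fn}(Q) = \emptyset$

(mt)   $\dfrac{P \xrightarrow{\alpha} P'}{[a = a]P,Q \xrightarrow{\alpha} P'}$

(mf)   $\dfrac{Q \xrightarrow{\alpha} Q' \quad a \neq b}{[a = b]P,Q \xrightarrow{\alpha} Q'}$

(rec)  $\dfrac{[\mathrm{rec}\,A(a; \tilde{u}).P / A,\ b; \tilde{v} / a; \tilde{u}]P \xrightarrow{\alpha} P'}{(\mathrm{rec}\,A(a; \tilde{u}).P)(b; \tilde{v}) \xrightarrow{\alpha} P'}$

(@)    $\dfrac{P \xrightarrow{\alpha} P'}{[\ell :: P] \xrightarrow{\alpha@\ell} [\ell :: P']}$

(cg)   $\dfrac{P =_\alpha P' \quad P' \xrightarrow{\alpha} Q' \quad Q' =_\alpha Q}{P \xrightarrow{\alpha} Q}$

(*) $\mathrm{subj}(w) \in \mathrm{fn}(\alpha) - \mathrm{nm}(\mathrm{subj}(\alpha))$

(**) $\mathrm{subj}(w) \notin \mathrm{nm}(\alpha)$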



Fig. 1. Labelled transition system 



some of them possibly being private. We also call subject of $\alpha$, denoted subj($\alpha$),
the name u whenever $\alpha = u\tilde{v}$ or $\alpha = (\nu\tilde{w})\bar{u}\tilde{v}$. We define the sets fn($\alpha$) and bn($\alpha$)
of names occurring respectively free and bound in the action $\alpha$ in the obvious
way (recall that in $(\nu a@\ell)$, a is bound while $\ell$ is free), and nm($\alpha$) denotes the
union of these two sets.

The rules of the transition system, given in Figure 1, extend the usual ones
for the π-calculus. In the rule (out), the term 0 is $(\nu a)((\mathrm{rec}\,A(a;\,).a().A(a;\,))(a;\,))$,
that is a process which cannot perform any action. In the rules (in) and (rec), it
is understood that the substitution operation involves some pattern matching,
e.g. $[a@b/x@y]P$ is $[a/x, b/y]P$. In the rule (ext), the action $\alpha$ must be an output
action (otherwise $(\nu w)\alpha$ would not be an action). There are also rules symmetric
to (cm) and (cp), which are omitted. In the rule (cg), the equivalence $=_\alpha$ stands
for syntactic identity up to renaming of bound names. In the rule (@) for located
processes, we extend the operation $\_@\ell$ to actions, as follows:



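$\alpha@\ell = \begin{cases}
\overline{u@\ell}\,\tilde{v} & \text{if } \alpha = \bar{u}\tilde{v} \\
u@\ell\,\tilde{v} & \text{if } \alpha = u\tilde{v} \\
(\nu u@\ell)(\alpha'@\ell) & \text{if } \alpha = (\nu u)\alpha' \\
(\nu a : \tau)(\alpha'@\ell) & \text{if } \alpha = (\nu a : \tau)\alpha'
\end{cases}$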






where in the last two cases we use α-conversion to ensure that $\ell \neq \mathrm{subj}(u)$ or
a. As one can see, any action of the located process $[\ell :: P]$ is located - that
is, its subject is a compound name -, but not necessarily at $\ell$: if for instance this
was an action of a sub-process located at $\ell'$, then its location is not modified,
since $(a@\ell')@\ell = a@\ell'$, as one can easily check. As a matter of fact, it is easy to see

that the following equation holds, where equality is the usual strong bisimilarity:

This means that our operational semantics expresses the fact that distributed
systems have a flat structure, like in [2,8], but unlike in the distributed Join
calculus [7] or the Ambient calculus [5] where “domains”, or “sites”, may be
nested. Moreover, there is only one “domain” with a given name, that is

$[\ell :: P] \mid [\ell :: Q] = [\ell :: P \mid Q]$

This makes a difference with the model of Ambients. These equalities allow us
to interpret $[\ell :: P]$ as “the process P migrating to $\ell$”. For instance the process
$[\ell :: P \mid [\ell' :: Q]] \mid [\ell :: R]$ behaves in the same way as $[\ell' :: Q] \mid [\ell :: P \mid R]$. Finally
one can see that in the communication rule (cm) the two complementary actions
must share the same subject u. This means that communication may only occur
on the same channel (a if u = a or $u = a@\ell$), at the same location ($\ell$ if $u = a@\ell$).
In other words, unlike in [2,7] where messages can transparently go through
domains to reach the corresponding receiver, we have the “go and communicate”
semantics of the Dπ-calculus, which is also the one of [5]. Typically, no transition
can arise from $[\ell :: \bar{a}\tilde{u}] \mid [\ell' :: a(\tilde{v}).P]$ if $\ell \neq \ell'$, and one has to explicitly move
and meet in order to communicate, like in $[\ell :: [\ell' :: \bar{a}\tilde{u}]] \mid [\ell' :: a(\tilde{v}).P]$.
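The flat structure of localities can be illustrated by a short Python sketch that normalises a term into a map from locations to the threads running there. The tuple encoding of terms (`"par"`, `"loc"`, and any other tag for an unlocated thread) is an assumption made for this example, and restriction is deliberately ignored.

```python
def flatten(process, here=None, out=None):
    """Group the threads of a process by the locality where they run.

    Assumed term encoding (hypothetical, for illustration only):
      ("par", p1, p2, ...)   parallel composition
      ("loc", l, p)          located process [l :: p]
      anything else          an atomic thread, kept as is
    A thread nested in several localities runs at the innermost one,
    which mirrors the fact that locating a term at l leaves a@l' unchanged.
    """
    if out is None:
        out = {}
    kind = process[0]
    if kind == "par":
        for part in process[1:]:
            flatten(part, here, out)
    elif kind == "loc":
        _, location, body = process
        flatten(body, location, out)
    else:
        out.setdefault(here, []).append(process)
    return out
```

Two terms that flatten to the same map differ only by the nesting and grouping of localities, so under the semantics above they are expected to behave alike; a message and a receiver can interact only if they end up under the same key.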

3 A Simple Type System 

In the polyadic π-calculus, the names are normally used according to some typing
discipline, and the processes are checked to obey this discipline. We will do the
same here, requiring the processes to conform to some typing assumption about
names, since this will be needed to ensure the message deliverability property we
are seeking. The types we use generalize the usual sorts of the π-calculus, and
in particular channel sorts will be types. We also have to assign types to location
names, and to compound names. The idea here is that a location type should
record the names and types of channels on which communication is possible
inside a locality. Therefore, a location type is just a typing context of a certain
kind, as in [8], while the type of a located channel is a “located channel type” $\gamma^@$ -
which is simpler than an existential type, as used in [8]. We also introduce a type
val, for names that may be tested for equality, but are not used as communication
channels or location names - we allow the latter to be compared too. Then the



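types (for names) are as follows:

$\tau, \sigma, \ldots ::= \zeta \mid \psi \mid \gamma^@$                      types
$\zeta ::= \mathit{val} \mid \gamma$                                values and channel types
$\gamma, \delta, \ldots ::= Ch(\tau_1, \ldots, \tau_n)$                  channel types
$\psi, \phi, \ldots ::= \{a_1 : \gamma_1, \ldots, a_n : \gamma_n\}$      location types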






a : val \~i a : val 



: 7 a : 7 



■0®^ \~t' (- • '0 



u : r , <l>\-eU : a 



a@£ : 7 a@£ : y' 






Fig. 2. Type system for names 



where in location types the $a_i$’s are distinct names, and the order of items $a_i : \gamma_i$
is irrelevant. The typing judgements have the form $\Psi \vdash_\ell P$, meaning that P,
when located at $\ell$, uses the names as prescribed by the typing context $\Psi$. A
typing context $\Psi$ is a pair $(\Psi_{typ}, \Psi_{loc})$ of mappings from a finite subset $\mathrm{dom}(\Psi)$
of $\mathcal{N} \cup \mathcal{P}$, satisfying the following constraints - where we write $\ell \notin \Psi$ to mean
that $\ell$ is neither in the domain of $\Psi$ nor in the types assigned by $\Psi$:

1. $\Psi_{typ}$ assigns to each $x \in \mathrm{dom}(\Psi)$ a value or channel type (i.e. a $\zeta$ type).

2. $\Psi_{loc}$ assigns to each $x \in \mathrm{dom}(\Psi)$ a finite set of names, such that $\ell \in \Psi_{loc}(x) \Rightarrow \ell \notin \Psi$.

3. if $\Psi_{typ}(x) = \mathit{val}$ or $x \in \mathcal{P}$ then $\Psi_{loc}(x) = \emptyset$.

A context $\Psi$ such that $\mathrm{dom}(\Psi) = \{x_1, \ldots, x_n\}$ with $\Psi_{typ}(x_i) = \zeta_i$ and $\Psi_{loc}(x_i) = L_i$
will be written:

$x_1@L_1 : \zeta_1, \ldots, x_n@L_n : \zeta_n$

This means that the name $x_i$ is used at the locations contained in $L_i$, uniformly
with type $\zeta_i$. We abbreviate $x@\{\ell\} : \gamma$ as $x@\ell : \gamma$, and we simply write $x_i : \zeta_i$
if $L_i = \emptyset$. In the typing rules we use the notation $\Psi, \Phi$ for the context defined as
follows:

$\mathrm{dom}(\Psi, \Phi) = \mathrm{dom}(\Psi) \cup \mathrm{dom}(\Phi)$

$(\Psi, \Phi)_{loc}(x) = \Psi_{loc}(x) \cup \Phi_{loc}(x)$

Notice that this context is undefined if $\Psi_{typ}(x) \neq \Phi_{typ}(x)$ for some $x \in \mathrm{dom}(\Psi, \Phi)$,
or if there exist $x \in \mathrm{dom}(\Psi, \Phi)$ and $\ell \in (\Psi_{loc}(x) \cup \Phi_{loc}(x))$ such that $\ell \in \mathrm{dom}(\Psi, \Phi)$
or $\ell$ occurs in some type assigned by $\Psi$ or $\Phi$. To state the typing rules for $\Psi \vdash_\ell P$,
we first need to introduce a system for establishing sequents of the form

$\Psi \vdash_\ell u_1 : \tau_1, \ldots, u_n : \tau_n$

that is for computing the types assigned to names by the context $\Psi$ at the current
location $\ell$. The rules for inferring these judgements are given in Figure 2, where
we write $\psi^{@\ell}$ for $a_1@\ell : \gamma_1, \ldots, a_n@\ell : \gamma_n$ if $\psi = \{a_1 : \gamma_1, \ldots, a_n : \gamma_n\}$. The
typing rules for processes are collected in Figure 3, where we use the following:






am-. Ch{7), <F,<P\-eau 



a@£: Ch{T) , $\-e P , >P\-eu:T 
a@£: Ch{7), <P a{u).P 



<P^eP\Q 

a@£ : "f , \-£i P 

<P h^/ (i/a@£)P 

£' 

[£..P] 



'P'riP , 'P'riQ 



’P'te [a = h]P,Q 



a : val , 'P \~e P 



P \-{ iya : val)P 



a@£ : y , P \~e P 
>P l-£ (va)P 

ip@£ ,<P^t P 
'll) , P \~i' (u£ : ip)P 



P \~i u : T 

a @£ : 7 , A : Ch^-y, 't), P , $\~e A(a; u) 



a@£ : 7 , A : Ch(y, ~r) , P , $ P , P \~iPl , P' \~i U : 
bm : y , <P' , (p \~e (rec A(a; u).P){b; 'v) 

(*) 'Ptyp{a) = val = 'Ptypib) or a, b ^ P. 

Fig. 3. Type system for terms 



Convention. In the rules for the binding constructs, that is input, restriction 
and recursion, we implicitly have the usual condition that the bound names do 
not occur in the resulting context. 

Let us comment on some of the rules of the system. We see that a simple name
may only be used if it is of type val, or if it is a channel located at the current
location, or if it is a location name. In this latter case, its type is the collection of
channel names, together with their type, located at that location. A compound
name may be used at any locality, but with a located type. In the rule for name
comparison, we see that we can compare both names of value type and location
names since $\ell \notin \Psi$ is true if $\ell$ is used as a locality in $\Psi$. To type $[\ell :: P]$ one must
be able to type P at locality $\ell$, while the resulting current locality of $[\ell :: P]$ is
arbitrary. Finally in the rules for recursive processes, we note that, to ensure that
the parameters are used in a consistent way, we assign to the process identifier
a (channel) type.

Our type system is very close to the simple type system for Dπ presented
in [8], except that we use located types $\gamma^@$ instead of existential types. Our
main result about typing is the standard “subject reduction property”. This
result is needed to establish the message deliverability property (Theorem 4.4).






ahP ^ IhP , I'hQ 

a ^ nm{u) 7n7 =0 

Ih au a Ih a{u).P 7 , 7' Ih (P | Q) 

u,I\\-P i\\-p dom(i/>)@7, 711- P 7II-P,7II-Q 

7 Ih (vu)P I Ih (va : val)P I Ih (i>£ : tp)P 7 Ih [a = b]P, Q 

alhP 7lhP 

(*) 

a Ih hl(o; ?I) b Ih (rec hl(a; m).P)( 6; 1; ) 7@7lh[7::P] 

(*) { a I a, a@£ £ 1} = $ 

Fig. 4. Well-formed terms 



Theorem 1 (Preservation Of Typing). IfW\~iP and P P' then 

(i) if a = uv then W = a@£' : Ch(r),d> with a@i' = u or a = u and £' = i, and 
'P' , P \~i P' where P' h^/ ~v : ~r , 

(ii) if a = {vw)iLv then P = a@7': Ch(r),P' , d> with a@P = u or a = u and 
I' = i, and there exists P” such that P' ,P” h^/ ~v : ~f and P' ^P" h^ P' , 

(iii) if a = T then P \~i P' . 

4 Interfaces and Receptive Processes 

It is easily seen that the type system does not guarantee the kind of safety
property we are looking for: we can obviously type non-receptive processes, and we
can also type processes sending a message that will never find a corresponding
receiver, like for instance $(\nu a)\bar{a}$. Then we introduce an inference system for
checking “well-formedness” of processes. Basically, to be well-formed a process
must not contain nested inputs on different names. Moreover, any input must
be involved in a recursive process that makes the receiver reincarnate itself,
possibly in a different state, after being consumed. In addition, we shall impose,
as in $\pi_1$ [2], that there is a unique receiver for each name, that is, two parallel
components of a process are not allowed to receive on the same name. Last but
not least, we demand that to restrict the scope of a name of channel type, we
know for a fact that a resource is provided - that is, a receiver exists - for that
name.

The well-formed processes are the ones for which a statement $I \Vdash P$, that
is “P is well-formed with interface I”, can be proved. In this statement the
interface I is a finite set of names on which a process may perform an input.
We present the rules for well-formedness in Figure 4, where we use the same
convention as for the typing regarding the binding constructs, and where, as





usual, a set $I = \{u_1, \ldots, u_n\}$ is represented as the list $u_1, \ldots, u_n$ of its elements,
and union is written $I, I'$. Our first result about well-formedness is, again, a
“subject reduction property”:

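Proposition 1 (Preservation of Well-Formedness). If $I \Vdash P$ and $P \xrightarrow{\alpha} P'$
then

(i) $I \Vdash P'$ if $\alpha = \tau$ or $\alpha = u\tilde{v}$,

(ii) if $\alpha = (\nu\tilde{w})\bar{u}\tilde{v}$ and $J = \{u \mid \exists i.\ w_i = u\} \cup \{a@\ell \mid \exists i.\ w_i = \ell : \psi \ \&\ a \in \mathrm{dom}(\psi)\}$
then $I, J \Vdash P'$.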

We can now establish the receptiveness property for our distributed calculus.
To state this property, let us define the predicate $P \downarrow u$, meaning “P may
perform an input with subject u”, that is:

$P \downarrow u \ \Leftrightarrow_{def}\ \exists\tilde{v}\, \exists P'.\ P \xrightarrow{u\tilde{v}} P'$

Proposition 2 (Receptiveness). Let P be a closed well-formed term. Then:

(i) if $I \Vdash P$ then $P \downarrow u$ iff $u \in I$,

(ii) if $P \downarrow u$ and $P \xrightarrow{\alpha} P'$ then $P' \downarrow u$.

This result suggests the denomination “distributed (asynchronous) receptive
π-calculus, with unique receivers”, in short the $D\pi_1^r$-calculus, for the set of
well-formed closed processes, which is closed by labelled transitions. Similarly we
call $\pi_1^r$, that is “the receptive $\pi_1$-calculus”, the sub-calculus where we do not use
any locality based feature.

We now turn to the issue of message delivery. We aim at showing that, if a
message is sent (at some locality) on a channel of a known scope in a well-formed
and typed process, then the process contains a receiver for this message. Let us
denote by $P \downarrow \bar{u}$ the fact that P performs an output action with subject u, and
let us define

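$P \simeq Q \ \Leftrightarrow_{def}\ \begin{cases} P \sim Q \text{ and} \\ \forall\Psi, \ell.\ \Psi \vdash_\ell P \Leftrightarrow \Psi \vdash_\ell Q \text{ and} \\ \forall I.\ I \Vdash P \Leftrightarrow I \Vdash Q \end{cases}$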

As a preliminary result, we show that any process that is able to send a message 
is equivalent to another having a special form: 

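Lemma 1. $P \downarrow \bar{a}$ iff $P \simeq (\nu\tilde{w})(\bar{a}\tilde{v} \mid R)$ for some $\tilde{w}$, $\tilde{v}$ and R with $a \notin \mathrm{subj}(\tilde{w})$.
$P \downarrow \overline{a@\ell}$ iff $P \simeq (\nu\tilde{w})([\ell :: \bar{a}\tilde{v}] \mid R)$ for some $\tilde{w}$, $\tilde{v}$ and R with $a \notin \mathrm{subj}(\tilde{w})$.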

Now we can prove our main result, where ^ stands for 
Theorem 2 (Message Deliverability). Let P be a closed well-formed and
typed process with 7 Ih P and P Gi P. If P ^ P' then 
(i) if P' ~ (iriju){afJ I R) with o G 7 U subj(w) then R I a or R ( a@£. 

(ii) if P' ~ (iyiju){[£' ::afj] \ R) with a@£' G I or a G subj(u;) then R J, a@£' or 
£' = £ and R I a. 

Note that this result does not hold for untyped terms. For instance we have
$\Vdash (\nu a : \mathit{val})\bar{a}$ or $\Vdash (\nu a : \emptyset)\bar{a}$, and these terms contain a message that cannot be
delivered.







5 Encoding the $\pi_1$-Calculus

In this section, we show that there is a translation from the $\pi_1$-calculus to $D\pi_1^r$.
It is shown in [3] that the joined input of the join-calculus [6] can be defined in
the $\pi_1$-calculus up to weak asynchronous bisimulation. On the other hand, it has
been shown in [6] that there is a fully abstract translation of the asynchronous
π-calculus in the join-calculus. Therefore, if we can translate the $\pi_1$-calculus
in the $D\pi_1^r$-calculus - or rather in the $\pi_1^r$-calculus -, we can reasonably claim
that the nice properties of the $D\pi_1^r$-calculus are not obtained at the expense
of its expressive power. We now give such a translation, that we show to be
fully abstract with respect to a refined form of asynchronous bisimulation [1,2],
defined as follows:

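Definition 1 (Asynchronous Bisimulation). A symmetric relation S is an
asynchronous bisimulation if P S Q implies

(i) there exists I such that $I \Vdash P$ and $I \Vdash Q$, and

(ii) if $P \xrightarrow{\tau} P'$ then $Q \xrightarrow{\tau} Q'$ for some Q' such that P' S Q',

(iii) if $P \xrightarrow{(\nu\tilde{w})\bar{u}\tilde{v}} P'$, $u \notin I$ and $\mathrm{subj}(\tilde{w}) \cap \mathrm{fn}(Q) = \emptyset$ then $Q \xrightarrow{(\nu\tilde{w})\bar{u}\tilde{v}} Q'$
for some Q' such that P' S Q',

(iv) if $P \xrightarrow{u\tilde{v}} P'$ then either $Q \xrightarrow{u\tilde{v}} Q'$ with P' S Q', or $Q \xrightarrow{\tau} Q'$ with $P'\ S\ (Q' \mid R)$
where $R = \bar{a}\tilde{v}$ if u = a and $R = [\ell :: \bar{a}\tilde{v}]$ if $u = a@\ell$.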

We denote with the greatest asynchronous bisimulation. The notion of weak 
asynchronous bisimulation is obtained by replacing everywhere transitions with 
weak transitions. We denote with the greatest weak asynchronous bisimula- 
tion. 

As a source calculus, we will consider the $\pi_1$-calculus with guarded recursion.
The idea of the encoding is quite standard and simple (see [2,3,6] for similar
encodings). We turn any message on a channel a into a request to a channel
manager CM(a) for a, sending the arguments of the message together with a
key out. Symmetrically, we turn any input on a into a request to CM(a), sending
a key in and a private return channel to actually receive something. The channel
manager will filter the messages according to the keys, and act as appropriate.
However, there is an attack on this encoding which compromises abstraction:
the environment can send a request for input to the channel manager. We then
authenticate the requests for input by introducing a restricted key $in_a$ for every
channel manager (of a) which is known only by the process that can actually
input on the channel a. To formulate our encoding, we will use several notational
conventions and abbreviations. Let us first define the identity agent, and recall
the input once construct of [2], given by

$Id_a =_{def} (\mathrm{rec}\,A(a;\,).a(\tilde{u}).(\bar{a}\tilde{u} \mid A(a;\,)))(a;\,)$
$a(\tilde{u}){:}P =_{def} a(\tilde{u}).(P \mid Id_a)$






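Then

$\mathrm{rec}\, A(\tilde a; \tilde u).P$   stands for   $(\mathrm{rec}\, A(\tilde a; \tilde u).P)(\tilde a; \tilde u)$
$(\nu a)P$   for   $(\nu a)(P \mid Id_a)$   (a not free in P)
$\bar a(\tilde u, \_, \tilde v)$   for   $(\nu c)(\bar a(\tilde u, c, \tilde v) \mid Id_a)$
$a(\tilde u, \_, \tilde v).P$   for   $a(\tilde u, c, \tilde v).P$   (c not in P)
$(\mathrm{rec}\, A(; \tilde b).P)(; \tilde c)$   for   $(\nu a'')(\mathrm{rec}\, A(a; \tilde b).(Id_a \mid P'))(a''; \tilde c)$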



where in the last clause P' is P where every free occurrence of $A(; \tilde c)$ is replaced
with $(\nu a')A(a'; \tilde c)$. We shall also need a kind of internal choice (similar to the one
given in [2] by means of “booleans”), $P \oplus_a Q$, where a is a channel name of type
$Ch(\tau, val, val)$, and P and Q are such that $a \vdash P, Q$. This is defined as follows,
where $c \neq c'$, $[x \neq y]P, Q$ is $[x = y]Q, P$ and, as usual, all the introduced names
are fresh:

P (BaQ ~de! (VC. val){a(., c, c) I (vc' : val)a{-, c, c') \ rec A{a- ).a{u, x,y). 

[x A c](a(w,®,y) I ^(a;)) , 

recA'{a-, ).a{u,x',y')\x' y^ c\{a{u,x' ,y') \ A'(a; )) , 

[x = y]P,Q) 

It is easy to see that this is a well-formed term, with interface {a}. In the
following translation, the new names are assumed to be fresh. In particular, for
every name a we assume a fresh name $in_a$ (that is, not in the set of names
of $\pi_1$). The name $in_a$ is used as the key of the channel a. We translate a
well-formed term $I \vdash P$ (where P respects some sorting) of the $\pi_1$-calculus, where
$I = \{a_1, \ldots, a_n\}$, into the following process of the $D\pi_1^r$-calculus:

$[\![ I \vdash P ]\!] = (\nu\, in_{a_1} : val) \cdots (\nu\, in_{a_n} : val)(CM(a_1;) \mid \cdots \mid CM(a_n;) \mid [\![ P ]\!])$

which turns out to be also well-formed in the context I. The type of a channel
name, say $Ch(\tau)$, is transformed into $Ch(val, \tau, Ch(\tau), val, val)$: the first
argument is the input/output key of the channel, then we have the arguments
of the message to be delivered, followed by the type of the return channel to
which they are actually sent, and then we have two keys for internal choice. The
channel manager CM(a;) is given by:

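$CM(a;) = \mathrm{rec}\, A(a;).a(j, \tilde b, s, c_1, c_1').a(i, \tilde d, r, c_2, c_2').$

$\quad L \oplus_a \text{if } j = in_a \text{ then } L \text{ else if } i \neq in_a \text{ then } L \text{ else } (\bar r \tilde b \mid A(a;))$ where
$\quad L = A(a;) \mid \bar a(j, \tilde b, s, c_1, c_1') \mid \bar a(i, \tilde d, r, c_2, c_2')$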

The process |P] is defined as follows: 

[d&] = d(_, & 

Ia(T).P] = (vr){a{ina,-,r,.,.) \ r('b):[Pl) 

I Q1 = HP} I IQl) 

l[a^b]P,Qj = [a = b]lPHQj 

l(va)Pj = (va)(vina. val){CM{a\) \ |P]) 

|(rec A(a; b).P){~c; d)| = (rec A(; a, irC, &)-I-P])(; d) 

[A(a; 6)1 = A(; a, 6) 






Our main result about this translation of the $\pi_1$-calculus into the receptive
$\pi_1$-calculus is that it is fully abstract with respect to weak asynchronous
bisimulation:

Theorem 3. Assume that $I \vdash P$ and $I \vdash Q$ in the $\pi_1$-calculus. Then
$P \approx_a Q \iff [\![ I \vdash P ]\!] \approx_a [\![ I \vdash Q ]\!]$

The proof, which is to be found in the full version of the paper (together with
examples showing the expressivity of our calculus with respect to distributed
systems), goes through an analysis of the various actions that may be performed
by $[\![ I \vdash P ]\!]$. We then show that an exact simulation of the behaviour of $\pi_1$
processes is provided by the translation.



References 

1. R. Amadio, I. Castellani and D. Sangiorgi, On bisimulations for the asynchronous
π-calculus, CONCUR’96, Springer Lect. Notes in Comp. Sci. 1119 (1996)
147-162.

2. R. Amadio, An asynchronous model of locality, failure, and process mobility.
In Proc. COORDINATION’97, Springer Lect. Notes in Comp. Sci. 1282 (1997).
Extended version appeared as Res. Report INRIA 3109.

3. R. Amadio, On modeling mobility, Journal of Theoretical Computer Science, to
appear (1999).

4. G. Boudol, Typing the use of resources in a concurrent calculus, Proc. ASIAN
97, Springer Lect. Notes in Comp. Sci. 1345 (1997) 239-253.

5. L. Cardelli and A. Gordon, Mobile ambients. In Proc. FoSSaCS, ETAPS’98,
Springer Lect. Notes in Comp. Sci. 1378 (1998) 140-155.

6. C. Fournet and G. Gonthier, The reflexive CHAM and the join-calculus. In
Proc. ACM Principles of Prog. Lang. (1996) 372-385.

7. C. Fournet, G. Gonthier, J.-J. Levy, L. Maranget, and D. Remy, A calculus
of mobile agents. In Proc. CONCUR’96, Springer Lect. Notes in Comp. Sci. 1119
(1996) 406-421.

8. M. Hennessy and J. Riely, Resource access control in systems of mobile agents,
Techn. Report 2/98, School of Cognitive and Computer Sciences, University of
Sussex (1998).

9. N. Kobayashi, A partially deadlock-free typed process calculus, ACM TOPLAS
Vol. 20 No 2 (1998) 436-482.

10. D. Sangiorgi, The name discipline of uniform receptiveness. In Proc. ICALP’97,
Springer Lect. Notes in Comp. Sci. 1256 (1997) 303-313.



Series and Parallel Operations on Pomsets 



Zoltan Esik¹* and Satoshi Okawa²

¹ Dept. of Computer Science, A. Jozsef University
Aradi v. tere 1., 6720 Szeged, Hungary
esik@inf.u-szeged.hu

² Department of Computer Software, The University of Aizu
Aizu-Wakamatsu-City, Fukushima 965-8580, Japan
okawa@u-aizu.ac.jp



Abstract. We consider two-sorted algebras of pomsets (isomorphism 
classes of labeled partial orders) equipped with the operations of series 
and parallel product and series and parallel omega power. The main 
results show that these algebras possess a non-finitely based polynomial 
time decidable equational theory, which can be axiomatized by an infinite 
set of simple equations. Along the way of proving these results, we show 
that the free algebras in the corresponding variety can be described by 
generalized series-parallel pomsets. We also provide a graph theoretic 
characterization of the generalized series-parallel pomsets. 



1 Introduction 

Partially ordered structures, and in particular isomorphism classes of labeled 
partial orders, called pomsets for Partially Ordered MultiSETs, have been used 
extensively to give semantics to concurrent languages [14,5,13,7,1,2], both in 
the operational and denotational (ordered and metric) framework, and to Petri 
nets [9,20,12,21,19], to mention a few references. (Pomsets are called partial 
words in [9].) The paper [6] deals with the relation between pomsets and 
Mazurkiewicz traces. For automata accepting pomset languages, i.e., sets of 
pomsets, we refer to [11]. The partial order is usually interpreted as a causal 
dependence between the events. The event structures of Winskel [22] are pom- 
sets enriched with a conflict relation subject to certain conditions. In [5], the 
computations determined by event structures are modeled by pomsets. Some 
authors only allow for finite pomsets, while others also use pomsets of infinite 
size, but very often place some restrictions on the partial order or labeling. There 
is an extensive literature dealing with the pomsets themselves, and pomsets in 
relation to languages, see, e.g., [17,8,16,15,3,4]. 

A wide variety of operations has been defined on pomsets. The definitions 
are motivated by the intended applications. However, there are two operations 
that play a central role, the series and the parallel product. The series product 
$P \cdot Q$ of two pomsets P and Q is obtained by taking the disjoint union of P

* Partially supported by the grants OTKA T30511 and FKFP 247/1999. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 316—328, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






and Q and making each vertex of Q larger than any vertex of P. The parallel
product $P \otimes Q$ is just disjoint union. There are also other names used in the
literature, e.g., concatenation and concurrence or concurrent composition [14,8],
sequential and parallel composition [5,7,1], sequencing and concurrency [6],
concatenation and disjoint union [15], sequential or serial product and parallel or
shuffle product [9,3]. The term shuffle is motivated by the fact that in
the simplest language model of concurrency parallel composition is modeled by
shuffle. A mathematical justification of the term is given by the result, proved
independently in [16] and [3], that languages equipped with concatenation and
shuffle satisfy the same set of equations as pomsets equipped with series and
parallel product. In fact, these equations can be captured by the bisemigroup¹
axioms expressing that both operations are associative and parallel product is
commutative. The series-parallel pomsets, i.e., those pomsets that can be generated
from the singletons by series and parallel product, have a well-known graph
theoretic characterization [9,17].

In this paper, we will consider non-empty countable pomsets. Since pomsets
model the behavior of processes that can be executed in at most $\omega$ steps, we
restrict the operation of series product to instances $P \cdot Q$ where P is finite. (See
also the last paragraph of Section 8. We could also restrict ourselves to pomsets
which have a linearization to an $\omega$-chain; this would not alter our results.) Given
a pomset P, we also define the series and parallel omega powers $P^{\omega} = P \cdot P \cdot \ldots$
and $P^{(\omega)} = P \otimes P \otimes \ldots$ Thus, $P^{\omega}$ and $P^{(\omega)}$ solve the fixed-point equations
$X = P \cdot X$ and $Y = P \otimes Y$, respectively. The series omega power $P^{\omega}$ is used
here only when P is finite. The pomsets $P^{\omega}$ and $P^{(\omega)}$ represent (sequential and
parallel) infinite looping behaviors.

Since some of the operations require that an argument is a finite pomset, 
we will work with two-sorted algebras of pomsets. The domains corresponding 
to the two sorts, the finite and the infinite sort, consist of the finite non-empty 
and the countably infinite pomsets whose action labels are in a given set A. 
We present a simple infinite equational basis E of these algebras (Corollary 2) 
and show that the equational theory is not finitely based (Theorem 8) . We also 
show that the equational theory is decidable in polynomial time (Theorem 9). 
Along the way of establishing these results, in Theorems 3 and 7 we give a 
concrete description by generalized series-parallel pomsets of the free algebras 
in the variety V axiomatized by the equations E. We also give a graph theoretic 
characterization of the generalized series-parallel pomsets (Theorem 6) . 

The series omega power operation in conjunction with series and parallel 
product, has already been studied in the recent paper [4]. Some of the results of 
the present paper extend corresponding results in [4] . 



¹ The empty pomset is a neutral element for both series and parallel product.
Bisemigroups with a neutral element are called double monoids in [9], dioids in [5], and
bimonoids in [3].






2 Pomsets 



We will consider finite non-empty and countably infinite posets $P = (P, \leq_P, \ell_P)$
whose elements, called vertices, are labeled in a set A of actions, or labels,
so that $\ell_P$ is a function $P \to A$. An isomorphism of A-labeled posets is an
order isomorphism which preserves the labeling. An A-labeled pomset, or just
pomset [14], for short, is an isomorphism class of A-labeled posets. Below we
will identify isomorphic labeled posets with the pomset they represent.

Some notation. For each non-negative integer n we denote the set $\{1, \ldots, n\}$
by [n].

Suppose that $P = (P, \leq_P, \ell_P)$ and $Q = (Q, \leq_Q, \ell_Q)$ are pomsets. We define
several operations, some of which will require that P is finite.

Series product. If P is finite, then the series product of P and Q is constructed
by taking the disjoint union of P and Q and making each vertex of Q
larger than any vertex of P. Thus, assuming without loss of generality that P
and Q are disjoint, $P \cdot Q = (P \cup Q, \leq_{P \cdot Q}, \ell_{P \cdot Q})$, where, for any $u, v \in P \cup Q$,

$$u \leq_{P \cdot Q} v \iff (u \in P \text{ and } v \in Q) \text{ or } u \leq_P v \text{ or } u \leq_Q v$$

$$\ell_{P \cdot Q}(u) = \begin{cases} \ell_P(u) & \text{if } u \in P \\ \ell_Q(u) & \text{if } u \in Q. \end{cases}$$


Parallel product. The parallel product of P and Q is constructed as the
disjoint union of P and Q. Thus, $P \otimes Q = (P \cup Q, \leq_{P \otimes Q}, \ell_{P \otimes Q})$, where we again
assume that P and Q are disjoint. Moreover, for any $u, v \in P \cup Q$,

$$u \leq_{P \otimes Q} v \iff u \leq_P v \text{ or } u \leq_Q v.$$

The function $\ell_{P \otimes Q}$ is defined as $\ell_{P \cdot Q}$ above.
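
The two product operations can be tried out on small finite examples. The following is a minimal sketch, not taken from the paper: it assumes a finite labeled poset is stored as a dictionary holding a vertex list, the set of strict order pairs, and a labeling, and the helper names (make_pomset, singleton, parallel, series) are hypothetical. Disjointness of the operands is enforced by tagging vertices with 0 or 1.

```python
def make_pomset(vertices, order, label):
    """Finite labeled poset: vertices, strict order pairs (u, v) meaning u < v, labels."""
    return {"V": list(vertices), "lt": set(order), "lab": dict(label)}


def singleton(a):
    """The one-vertex pomset labeled a."""
    return make_pomset([0], set(), {0: a})


def _tag(p, t):
    # Rename every vertex v to (t, v) so that two operands become disjoint.
    return make_pomset(
        [(t, v) for v in p["V"]],
        {((t, u), (t, v)) for (u, v) in p["lt"]},
        {(t, v): a for v, a in p["lab"].items()},
    )


def parallel(p, q):
    """Parallel product: disjoint union, no new order relations."""
    p, q = _tag(p, 0), _tag(q, 1)
    return make_pomset(p["V"] + q["V"], p["lt"] | q["lt"], {**p["lab"], **q["lab"]})


def series(p, q):
    """Series product: disjoint union, every vertex of q above every vertex of p."""
    r = parallel(p, q)
    left = [u for u in r["V"] if u[0] == 0]   # vertices coming from p
    right = [v for v in r["V"] if v[0] == 1]  # vertices coming from q
    r["lt"] |= {(u, v) for u in left for v in right}
    return r
```

For instance, series(parallel(singleton("a"), singleton("b")), singleton("c")) has three vertices and two order pairs, since the vertex labeled c lies above both a and b. The sketch works with representatives only; comparing results up to isomorphism is left out.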

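Series omega power. Assume that P is finite. The series product of $\omega$
copies of P is called the series omega power of P. Thus, denoting $\mathbb{N} = \{1, 2, \ldots\}$,
$P^{\omega} = (P \times \mathbb{N}, \leq_{P^{\omega}}, \ell_{P^{\omega}})$, where

$$(u, i) \leq_{P^{\omega}} (v, j) \iff i < j \text{ or } (i = j \text{ and } u \leq_P v)$$

$$\ell_{P^{\omega}}((u, i)) = \ell_P(u), \quad \text{for all } (u, i), (v, j) \in P \times \mathbb{N}.$$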

Parallel omega power. The parallel omega power of P is the disjoint
sum of P with itself $\omega$ times. Thus, $P^{(\omega)} = (P \times \mathbb{N}, \leq_{P^{(\omega)}}, \ell_{P^{(\omega)}})$, where

$$(u, i) \leq_{P^{(\omega)}} (v, j) \iff i = j \text{ and } u \leq_P v$$

$$\ell_{P^{(\omega)}}((u, i)) = \ell_P(u), \quad \text{for all } (u, i), (v, j) \in P \times \mathbb{N}.$$

Below we will sometimes write $\omega P$ for $P^{(\omega)}$. Similarly, for each integer $n \geq 1$,
we define nP to be the n-fold parallel product of P with itself.
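
Although the omega powers are infinite, their order relations are determined pointwise, so they can be represented by a comparison function on pairs (u, i). A minimal sketch under that assumption follows; the names series_omega_leq, parallel_omega_leq and chain_leq are hypothetical and only illustrate the two definitions above.

```python
def series_omega_leq(leq_p):
    """Order of the series omega power on pairs (u, i) with i = 1, 2, ..."""
    def leq(x, y):
        (u, i), (v, j) = x, y
        # An earlier copy lies entirely below a later one; inside a copy use P's order.
        return i < j or (i == j and leq_p(u, v))
    return leq


def parallel_omega_leq(leq_p):
    """Order of the parallel omega power on pairs (u, i) with i = 1, 2, ..."""
    def leq(x, y):
        (u, i), (v, j) = x, y
        # Distinct copies are unrelated; inside a copy use P's order.
        return i == j and leq_p(u, v)
    return leq


def chain_leq(u, v):
    """Example order for P: the two-element chain with vertex "a" below vertex "b"."""
    return u == v or (u, v) == ("a", "b")
```

With P the chain a below b, the vertex ("b", 1) is below ("a", 2) in the series omega power, whereas in the parallel omega power the two are incomparable.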

Equipped with these operations, the (countable non-empty) A-labeled pomsets
form a two-sorted algebra $\omega\mathrm{Pom}(A) = (\mathrm{Pom}_F(A), \mathrm{Pom}_I(A), \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$,

where $\mathrm{Pom}_F(A)$ and $\mathrm{Pom}_I(A)$ denote the collections of all finite non-empty
and countably infinite A-labeled pomsets, respectively. We would like to know
the equations satisfied by the algebras $\omega\mathrm{Pom}(A)$.




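Proposition 1. The following equations hold in any algebra $\omega\mathrm{Pom}(A)$.

$$x \cdot (y \cdot u) = (x \cdot y) \cdot u \qquad (1)$$

$$u \otimes (v \otimes w) = (u \otimes v) \otimes w \qquad (2)$$

$$u \otimes v = v \otimes u \qquad (3)$$

$$(x \cdot y)^{\omega} = x \cdot (y \cdot x)^{\omega} \qquad (4)$$

$$(x^{n})^{\omega} = x^{\omega}, \quad n \geq 2 \qquad (5)$$

$$u \otimes \omega u = \omega u \qquad (6)$$

$$\omega(u \otimes v) = \omega u \otimes \omega v \qquad (7)$$

$$\omega u \otimes \omega u = \omega u \qquad (8)$$

$$\omega(\omega u) = \omega u, \qquad (9)$$

where x, y range over finite pomsets and u, v, w range over finite and infinite
pomsets.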

Recall from the Introduction that one-sorted structures equipped with operations
$\cdot$ and $\otimes$ satisfying the equations (1), (2) and (3) are called bisemigroups.

Some subalgebras of $\omega\mathrm{Pom}(A)$ are also of interest. Let $\mathrm{SP}_A$ denote the set
of all (finite) pomsets generated from the singleton pomsets corresponding to
the elements of A by the operations of series and parallel product. Pomsets in
$\mathrm{SP}_A$ are called series-parallel. Moreover, let $\mathrm{SP}_A^{\omega,(\omega)}$ denote the set of all infinite
pomsets generated from the singletons by the operations of series and parallel
product, and series and parallel omega power, and let $\mathrm{SP}_A^{\omega}$ denote the set of all
infinite pomsets that can be constructed from the singletons by the two product
operations and the series omega power operation. It follows by a straightforward
induction on the number of applications of the operations that any pomset in
$\mathrm{SP}_A^{\omega,(\omega)}$ has a linearization to an $\omega$-chain. The elements of $\mathrm{SP}_A^{\omega,(\omega)}$ are called
generalized series-parallel pomsets. We usually identify any letter $a \in A$ with the
corresponding singleton pomset labeled a. For further reference, we recall

Theorem 1. Grabowski [9] $\mathrm{SP}_A$ is the free bisemigroup generated by A.

Theorem 2. Bloom and Esik [4] The algebra $(\mathrm{SP}_A, \mathrm{SP}_A^{\omega}, \cdot, \otimes, {}^{\omega})$ is freely
generated by A in the variety of all algebras equipped with the two product operations
and the series omega power operation satisfying the equations (1) - (5).

More precisely, in [4] the parallel product of two pomsets was defined only if
both pomsets are finite or both are infinite, causing only a little change in the
above result and its proof.

3 Freeness 

Suppose that $C = (C_F, C_I, \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$ is a two-sorted algebra, where the binary
product operation $\cdot$ is defined only if its first argument is in $C_F$ and the
unary ${}^{\omega}$ operation only if its argument is in $C_F$. The arguments of the other
binary product operation $\otimes$ as well as the argument of the unary operation ${}^{(\omega)}$
may come from both $C_F$ and $C_I$. The result of applying the ${}^{\omega}$ or ${}^{(\omega)}$ operation
is always in $C_I$. The result of applying a binary operation is in $C_I$ iff one of
the arguments of the operation is in $C_I$. We say that C satisfies the equations
(1) - (9), or that these equations hold in C, if these equations hold in C when arbitrary
elements of $C_F$ and $C_I$ are substituted for the variables with the proviso
that elements of $C_F$ are substituted for the variables x, y of finite sort. Let V
denote the variety of all two-sorted algebras C equipped with the above operations
satisfying the equations (1) - (9). Note that the equation $x \cdot x^{\omega} = x^{\omega}$ also
holds in V, where x is a variable of finite sort. Also, note that it is sufficient to
require the power identities (5) only for prime numbers n.
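
The equation $x \cdot x^{\omega} = x^{\omega}$ mentioned above is not spelled out in the text; one way to obtain it, using only the identities (4) and (5) of Proposition 1 and writing $x^2$ for $x \cdot x$, is sketched here:

```latex
\begin{align*}
x^{\omega} &= (x \cdot x)^{\omega}          && \text{by (5) with } n = 2 \\
           &= x \cdot (x \cdot x)^{\omega}  && \text{by (4) with } y = x \\
           &= x \cdot x^{\omega}            && \text{by (5) with } n = 2 .
\end{align*}
```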

Theorem 3. The algebra $\omega\mathrm{SP}_A = (\mathrm{SP}_A, \mathrm{SP}_A^{\omega,(\omega)}, \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$ is freely
generated by A in V.

Proof. In our argument, we make use of the rank of a pomset $P \in \mathrm{SP}_A \cup \mathrm{SP}_A^{\omega,(\omega)}$,
denoted rank(P), defined to be the smallest number of applications of
the operations by which P can be generated from the singletons. We will also
make use of normal representations of A-labeled pomsets.

Suppose that P is an A-labeled pomset. A series normal representation of P
is a representation $P = P_1 \cdot \ldots \cdot P_k$ or $P = R_1 \cdot \ldots \cdot R_m \cdot (S_1 \cdot \ldots \cdot S_n)^{\omega}$, where $k, n \geq 1$
and $m \geq 0$, and where the pomsets $P_i$, $R_j$ and $S_t$, called the components of the
representation, are serially indecomposable, i.e., none of them can be written
as the series product of two pomsets.² Moreover, we require that $R_m \neq S_n$ if
$m \geq 1$, and that $S_1 \cdot \ldots \cdot S_n$ cannot be written as a non-trivial series power
of any pomset, or equivalently, as $(S_1 \cdot \ldots \cdot S_i)^{n/i}$, where i is a proper divisor
of n. (Indeed, otherwise the representation could be simplified by the identity (4)
or (5).)

A parallel normal representation of P is $P = n_1 P_1 \otimes \ldots \otimes n_k P_k$, where
$k \geq 1$, $n_i \in \{1, 2, \ldots, \omega\}$, for all $i \in [k]$, and where the $P_i$ are pairwise distinct
and connected, i.e., parallelly indecomposable. Again, the pomsets $P_i$ are called
the components of the representation. If a pomset P is connected, a normal
representation of P is a series normal representation of P. If P is disconnected,
a normal representation of P is a parallel normal representation.

Note that if P is a singleton, or more generally, if P is serially (parallelly,
respectively) indecomposable, then its unique series (parallel) normal representation
is P. It is clear that each A-labeled pomset has at most one series and,
up to a rearrangement of the components, at most one parallel normal representation.
Thus, when it exists, we can refer to the series or parallel normal
representation of a pomset. Suppose that $P \in \mathrm{SP}_A \cup \mathrm{SP}_A^{\omega,(\omega)}$. Then one can argue
by induction on rank(P) to prove that P has both a series and a parallel
normal representation. Moreover, all components of the series and parallel normal
representation of P are in $\mathrm{SP}_A \cup \mathrm{SP}_A^{\omega,(\omega)}$, and the rank of any component
of the normal representation of P is strictly less than the rank of P.

² Recall that all pomsets considered in this paper are non-empty.






To complete the proof of Theorem 3, suppose that we are given a two-sorted
algebra $C = (C_F, C_I, \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$ in V together with a function $h : A \to C_F$. We
need to show that h extends to a homomorphism $h^{\sharp} = (h^{\sharp}_F, h^{\sharp}_I) : \omega\mathrm{SP}_A \to C$,
where $h^{\sharp}_F : \mathrm{SP}_A \to C_F$ and $h^{\sharp}_I : \mathrm{SP}_A^{\omega,(\omega)} \to C_I$. Of course, $h^{\sharp}_F$ is just the
unique bisemigroup homomorphism $\mathrm{SP}_A \to C_F$ extending h, which exists by
Theorem 1. But we need to define $h^{\sharp}_I(P)$ for each $P \in \mathrm{SP}_A^{\omega,(\omega)}$. This is done
by induction on r = rank(P). (The reader will probably be relieved that from
now on we will omit the indices I and F.) We start the induction by r = 0 in
which case there is nothing to prove, since there is no pomset in $\mathrm{SP}_A^{\omega,(\omega)}$ whose
rank is 0. Suppose that r > 0. If P is connected, then let $P = P_1 \cdot \ldots \cdot P_k$ or
$P = R_1 \cdot \ldots \cdot R_m \cdot (S_1 \cdot \ldots \cdot S_n)^{\omega}$ be its series normal representation. Since the rank of
each component is less than r, it makes sense to define $h^{\sharp}(P) = h^{\sharp}(P_1) \cdot \ldots \cdot h^{\sharp}(P_k)$,
in the first case, and $h^{\sharp}(P) = h^{\sharp}(R_1) \cdot \ldots \cdot h^{\sharp}(R_m) \cdot (h^{\sharp}(S_1) \cdot \ldots \cdot h^{\sharp}(S_n))^{\omega}$, in
the second. If P is disconnected, then take its parallel normal representation
$P = k_1 P_1 \otimes \ldots \otimes k_n P_n$. We define $h^{\sharp}(P) = k_1 h^{\sharp}(P_1) \otimes \ldots \otimes k_n h^{\sharp}(P_n)$. By the
preceding observations and the associativity of the product operations and the
commutativity of $\otimes$, the function $h^{\sharp}$ is well-defined. The equations (1) - (9) ensure
that $h^{\sharp}$ preserves the operations. The details are routine. □

4 A Characterization 

Ideals and filters of a pomset (or poset) P are defined as usual. Each non-empty
subset of P is included in a smallest ideal and in a smallest filter, respectively
called the ideal and the filter generated by the set. An ideal (filter, resp.) generated
by a singleton set is called a principal ideal (principal filter, resp.). A filter F
is connected if F is connected as a partially ordered set, equipped with the induced
partial order. Below each filter and ideal of a pomset P will be considered to be
a pomset determined by the partial order and labeling inherited from P.

Theorem 4. Grabowski [9], Valdes, Lawler and Tarjan [17] An A-labeled pomset
P belongs to $\mathrm{SP}_A$ iff P is finite and satisfies the N-condition, i.e., P does
not have a four-element subposet $\{u_1, u_2, u_3, u_4\}$ whose non-trivial order relations
are given by $u_1 < u_3$, $u_2 < u_3$ and $u_2 < u_4$.
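
For a finite poset the N-condition can be tested by brute force. The sketch below is only an illustration: it assumes the strict order is given as a transitively closed set of pairs (u, v) meaning u < v, and the function name has_N is hypothetical. It searches for four vertices among which exactly the three relations of the N-condition hold.

```python
from itertools import permutations


def has_N(vertices, lt):
    """True iff the finite poset contains an induced N.

    lt is the strict order as a transitively closed set of pairs (u, v), u < v.
    """
    def less(a, b):
        return (a, b) in lt

    def parallel(a, b):
        return a != b and not less(a, b) and not less(b, a)

    for u1, u2, u3, u4 in permutations(vertices, 4):
        # Exactly u1 < u3, u2 < u3, u2 < u4; all other pairs are incomparable.
        if (less(u1, u3) and less(u2, u3) and less(u2, u4)
                and parallel(u1, u2) and parallel(u3, u4) and parallel(u1, u4)):
            return True
    return False
```

The search takes time proportional to the fourth power of the number of vertices, which is adequate for small examples; the series-parallel recognition algorithm of [17] is far more efficient and is not reproduced here.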

The following facts are clear. 

Lemma 1 . Any subpomset determined by a non-empty subset of a pomset sat- 
isfying the N-condition also satisfies this condition. 

Lemma 2. Suppose that a connected pomset P satisfies the N-condition. Then 
any two vertices of P have an upper or a lower bound. 

We also recall 

Theorem 5. Bloom and Esik [4] An A-labeled pomset P belongs to $\mathrm{SP}_A^{\omega}$ iff P
is (countably) infinite and the following hold: 1. P satisfies the N-condition. 2.
Each principal ideal of P is finite. 3. Up to isomorphism P has a finite number
of filters.






The width of a pomset P is the maximum number of pairwise parallel vertices
of P. If P has infinitely many pairwise incomparable vertices, then its width
is $\omega$. (Recall that we only consider non-empty countable pomsets.) Note that
any pomset satisfying the last condition of Theorem 5 has finite width. The
second condition is present in much of the literature on event structures. Also,
if a pomset P satisfies this condition, then every finitely generated ideal of P is
finite, and each non-empty subset of P contains a minimal element. Moreover,
we have $u < v$ for two distinct vertices u and v iff there exists a sequence
$u = u_1 < u_2 < \ldots < u_k = v$ such that $u_{i+1}$ is an immediate successor of $u_i$, for
each $i \in [k - 1]$. (Of course, vertex w' is an immediate successor of w if $w < w'$
and there exists no z with $w < z < w'$.) Moreover, each vertex $u \in P$ has a finite
height n, for some integer $n \geq 0$, i.e., there is a longest path from a minimal
vertex to u, and the length of this path is n. (Minimal vertices have height 0.)
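
For finite posets both notions are easy to compute directly. The following sketch assumes, as an illustration only, that the strict order is a transitively closed set of pairs over hashable vertices; the names heights and width are hypothetical. Heights follow the longest-path definition above, and the width is found by exhaustive search over subsets, so it is suitable for small examples only.

```python
from functools import lru_cache
from itertools import combinations


def heights(vertices, lt):
    """Map each vertex to the length of a longest chain from a minimal vertex to it."""
    below = {u: [v for v in vertices if (v, u) in lt] for u in vertices}

    @lru_cache(maxsize=None)
    def h(u):
        # Minimal vertices have height 0; otherwise one more than the tallest predecessor.
        return 0 if not below[u] else 1 + max(h(v) for v in below[u])

    return {u: h(u) for u in vertices}


def width(vertices, lt):
    """Largest number of pairwise parallel vertices of a non-empty finite poset."""
    vs = list(vertices)

    def is_antichain(s):
        return all((a, b) not in lt and (b, a) not in lt for a, b in combinations(s, 2))

    return max(k for k in range(1, len(vs) + 1)
               if any(is_antichain(s) for s in combinations(vs, k)))
```

For the poset with vertices 1, 2 below 3, 4 (all four cross relations), the heights are 0, 0, 1, 1 and the width is 2.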

An $\omega$-branch is a poset which is isomorphic to the poset whose vertices are the
ordered pairs (i, j), where i is a non-negative integer and j is 0 or 1. Moreover,
the immediate successors of a vertex (i, 0) are (i + 1, 0) and (i, 1), and vertices
of the form (i, 1) are maximal. Our result is:

Theorem 6. An A-labeled pomset P is in $\mathrm{SP}_A^{\omega,(\omega)}$ if and only if it is (countably)
infinite and the following conditions hold: 1. P satisfies the N-condition. 2. Each
principal ideal of P is finite. 3. Up to isomorphism P has a finite number of
connected filters. 4. P has no $\omega$-branch.

In our proof of Theorem 6, we will make use of several observations. The 
following fact depends on our assumption that pomsets are countable. 

Proposition 2. A pomset $P \in \mathrm{Pom}_I(A)$ has a linearization to an $\omega$-chain iff
each principal ideal of P is finite.



Lemma 3. Any filter of an A-labeled pomset satisfying the four conditions of 
Theorem 6 also satisfies these conditions. 



Lemma 4. If P is an A-labeled pomset which has finite width and, up to iso- 
morphism, a finite number of connected filters, and if each principal ideal of P 
is finite, then P has, up to isomorphism, a finite number of filters. 

Proof of Theorem 6. It can be argued by induction on the rank of the pomset
$P \in \mathrm{SP}_A \cup \mathrm{SP}_A^{\omega,(\omega)}$ that P satisfies all conditions of Theorem 6. To
prove the other direction, suppose that a countably infinite A-labeled pomset P
satisfies all four conditions. If P has finite width, then, by Lemma 4, P has, up to
isomorphism, a finite number of filters. Thus, by Theorem 5, P is generated from
the singletons by the operations of series and parallel product and series omega
power. Suppose now that the width of P is $\omega$. We argue by induction on the
number of non-isomorphic connected filters to show that P is in $\mathrm{SP}_A^{\omega,(\omega)}$. If P has,
up to isomorphism, only one connected filter F, then this filter F is a principal filter,
and any connected component of P is isomorphic to F. Moreover, no two parallel
vertices of P have an upper bound, since otherwise P would have a one-generated
and a two-generated connected filter. Also, no two parallel vertices have a lower
bound. Indeed, if $u_0 < u_1$ and $u_0 < u_2$, for some vertices $u_0, u_1, u_2$ such that $u_1$
and $u_2$ are parallel, then since the filter generated by $u_1$ is isomorphic to the filter
generated by $u_0$, there are parallel vertices $u_3$ and $u_4$ with $u_1 < u_3$ and $u_1 < u_4$.
Since no two parallel vertices have an upper bound, it holds that $u_2$ is parallel
to both $u_3$ and $u_4$. Continuing in this way, there results an $\omega$-branch, contrary
to our assumptions on P. Thus, no two parallel vertices of P have an upper or
a lower bound. Since P has infinite width, it follows now that P is the parallel
omega power of a linearly ordered labeled pomset. Since the principal
ideal generated by any vertex is finite, and since P has, up to isomorphism, a
single principal filter, it follows that $P = \omega a$ or $P = \omega a^{\omega}$, for some $a \in A$.

In the induction step, we assume that P has, up to isomorphism, n > 1 connected
filters and that our claim is true for pomsets having, up to isomorphism,
at most $n - 1$ connected filters. Note that P cannot be directed.³ Indeed, by
assumption P has an infinite number of pairwise parallel vertices, say $u_1, u_2, \ldots$.
Thus, if P were directed, then for each $m \geq 1$, the vertices $u_1, u_2, \ldots, u_m$ would
generate a connected filter. But any two of these filters are non-isomorphic. It
follows now that P is either disconnected or eventually disconnected, i.e., there
exists a least integer $k \geq 0$ such that the vertices of height k or more form
a disconnected pomset. Indeed, if $z_1$ and $z_2$ do not have an upper bound and
the height of $z_1$ is less than or equal to the height of $z_2$, then by Lemma 2 the
vertices whose height is at least the height of $z_1$ form a disconnected pomset.
We prove that when k > 0, any vertex of height k is over each vertex of height
$k - 1$. To establish this fact, first we show that if $v_1$ and $v_2$ are distinct height k
vertices which do not have an upper bound, then any height $k - 1$ vertex $u_1$
below $v_1$ is also below $v_2$. Indeed, if $u_1 < v_2$ does not hold, then take a height
$k - 1$ vertex $u_2$ with $u_2 < v_2$. By the N-condition, $v_1$ and $u_2$ are parallel. Since
the vertices of height $k - 1$ or more form a connected filter, by Lemma 2 $u_1$
and $u_2$ have an upper bound w. Since $v_1 < w$ or $v_2 < w$ does not hold, either
the vertices $u_1, u_2, v_1, w$, or the vertices $u_1, u_2, w, v_2$ form an N, a contradiction.
Suppose now that $v_1$ has height k and u has height $k - 1$. To show $u < v_1$, let $v_2$
be a second height k vertex such that $v_1$ and $v_2$ have no upper bound. Such a vertex
exists since the vertices of height k or more are disconnected. Let $u_1, u_2$ have
height $k - 1$, $u_i < v_i$, i = 1, 2. By the preceding argument, $u_2 < v_1$ and $u_1 < v_2$.
To obtain a contradiction, suppose that $u < v_1$ does not hold. Then u is distinct
from $u_1, u_2$. Using Lemma 2 and the fact that the vertices of height $k - 1$ or
more are connected, it follows by the N-condition that there exists an upper
bound w for $u, u_1, u_2$. Indeed, if $u_1 = u_2$, then this is immediate from Lemma 2.
If $u_1 \neq u_2$, then let $z_i$ be an upper bound for u and $u_i$, i = 1, 2. If $z_1 < z_2$
or $z_2 < z_1$ then we are done, let $w = z_2$ or $w = z_1$, respectively. If $z_1$ and $z_2$ are
parallel, then by the N-condition, $u_1 < z_2$ and $u_2 < z_1$, so that we may again
let $w = z_1$ or $w = z_2$. But since u is not below $v_1$ and $v_1$ or $v_2$ is not below w,
either $u_1, u, v_1, w$ or $u, u_2, w, v_2$ or $u, v_1, w, v_2$ form an N. Since this is impossible,

³ A pomset P is directed if any two elements of P have an upper bound.






we have established the fact that any vertex of height k is over all of the height
$k - 1$ vertices.

Thus, if k > 0, then each vertex in S is over any vertex in R, where R is the
collection of all vertices of height $k - 1$ or less, and S = P − R. The pomset R is
of finite width. Indeed, if $u_1, u_2, \ldots$ were pairwise parallel vertices of R, then an
infinite family of pairwise non-isomorphic connected filters of P would result by
taking, for each $m \geq 1$, the filter generated by the m vertices $u_1, \ldots, u_m$. Since
both the height and the width of R are finite, it follows now by the assumptions
that R is itself finite, so that $R \in \mathrm{SP}_A$, since R satisfies the N-condition. (See
Lemma 1.) By Lemma 3, the pomset S also satisfies the conditions involved in
the theorem. Clearly, any connected filter of S is a connected filter of P. Thus,
if S has, up to isomorphism, n connected filters, then it has a filter isomorphic
to P. Using this fact it follows now easily that P has an $\omega$-branch, contradicting
our assumptions on P. Thus, S has, up to isomorphism, at most $n - 1$ connected
filters, so that $S \in \mathrm{SP}_A^{\omega,(\omega)}$ by the induction assumption. Since $P = R \cdot S$, we
have $P \in \mathrm{SP}_A^{\omega,(\omega)}$.

Suppose finally that k = 0, i.e., that P is disconnected. Since any connected
component of P is a connected filter, it follows that P has, up to isomorphism, a
finite number of connected components. Thus, since P is countable, we can write
$P = P_1 \otimes \ldots \otimes P_m \otimes Q_1^{(\omega)} \otimes \ldots \otimes Q_s^{(\omega)}$, for some connected pomsets $P_i$ and $Q_j$.
By Lemma 3, each of these pomsets satisfies the conditions of the theorem. Also,
each has at most n connected filters. Thus, by the above argument, it follows
that each is in $\mathrm{SP}_A \cup \mathrm{SP}_A^{\omega,(\omega)}$, so that P is also in $\mathrm{SP}_A^{\omega,(\omega)}$. □

5 Free Algebras, Revisited 

Since the algebras we are dealing with are two-sorted, to get a complete description
of the free algebras in the variety V, we also need to describe the structure
of the free algebras generated by pairs of sets (A, B), where A is the set of
generators of finite sort, and B is a set of generators of infinite sort.

We will describe the free algebra generated by (A, B) as a subalgebra of
$\omega\mathrm{SP}_{A \cup B}$, where without loss of generality we may assume that the sets A and B
are disjoint. Let $\omega\mathrm{SP}_{A,B} = (\mathrm{SP}_A, \mathrm{SP}_{A,B}^{\omega,(\omega)}, \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$ denote the subalgebra of
$\omega\mathrm{SP}_{A \cup B}$ generated by the singleton pomsets a, for $a \in A$, and the pomsets $b^{\omega}$,
for $b \in B$.

Proposition 3. Suppose that P G Then P belongs to ijf the 

following conditions hold: 1. The principal ideal generated by any vertex labeled b, 
for some b G B, is isomorphic to 2. If two parallel vertices have an upper 
bound, then both are labeled in A. 

Another representation of the algebra $\omega\mathrm{SP}_{A,B}$ can be obtained by allowing
maximal vertices of a pomset to be labeled by elements of B. Formally, let
$\mathrm{Pom}_I(A, B)$ denote the collection of all countable non-empty $(A \cup B)$-labeled
pomsets P with the property that every vertex labeled in B is maximal, and such
that P is either infinite or contains a vertex labeled in B. The set of pomsets
$\mathrm{Pom}_F(A)$ was defined at the beginning of Section 2. The operations of series
and parallel product and series and parallel omega power can be generalized to
pomsets in $\mathrm{Pom}_F(A)$ and $\mathrm{Pom}_I(A, B)$ in a straightforward way, so that we get
a two-sorted algebra $\omega\mathrm{Pom}(A, B) = (\mathrm{Pom}_F(A), \mathrm{Pom}_I(A, B), \cdot, \otimes, {}^{\omega}, {}^{(\omega)})$. This
algebra also satisfies the identities (1) - (9), i.e., $\omega\mathrm{Pom}(A, B)$ is in V.

Proposition 4. The algebra $\omega\mathrm{Pom}(A, B)$ can be embedded into $\omega\mathrm{Pom}(A \cup B)$.
The subalgebra of $\omega\mathrm{Pom}(A, B)$ generated by the singleton pomsets corresponding
to the elements of $A \cup B$ is isomorphic to the algebra $\omega\mathrm{SP}_{A,B}$ described above.

In fact, an embedding can be obtained by taking the identity map on the
finite pomsets $\mathrm{Pom}_F(A)$ and mapping each pomset $P \in \mathrm{Pom}_I(A, B)$ to the
pomset Q that results by replacing each vertex of P labeled b, for some $b \in B$,
by the pomset $b^{\omega}$.

Theorem 7. The algebra $\omega SP_{A,B}$ is freely generated by $(A, B)$ in V.

Corollary 1. The variety V is generated by either of the following classes of
algebras. 1. The algebras $\omega Pom(A)$ or $\omega Pom(A, B)$. 2. The algebras
$\omega SP_A$ or $\omega SP_{A,B}$.

Corollary 2. The following conditions are equivalent for a sorted equation
$t = t'$. 1. $t = t'$ holds in V. 2. $t = t'$ holds in all algebras $\omega Pom(A)$ or
$\omega Pom(A, B)$. 3. $t = t'$ holds in all algebras $\omega SP_A$ or $\omega SP_{A,B}$.

6 No Finite Axiomatization 

By the compactness theorem, V has a finite axiomatization iff the equations 
(1) - (9) contain a finite subsystem which forms a complete axiomatization of V. 

Theorem 8. For any finite subset $E$ of the identities (1) - (9) there is a
two-sorted algebra which is a model of $E$ but fails to satisfy all of the power
identities (5). Indeed, for any prime $p$ there is a model $C_p = (F_p, I_p)$ which
satisfies the equations (1) - (4) and (6) - (9) as well as all of the power
identities (5) for $n < p$, but such that the identity $(x^p)^\omega = x^\omega$ fails
in $C_p$. Thus V has no finite axiomatization.

Proof. Given a prime $p$, let $F_p$ be the set of positive integers and let
$I_p = \{1, p, \top\}$. For all $n \in F_p$, let $\rho(n) = 1$ if $p$ does not divide
$n$, and let $\rho(n) = p$ if $p$ divides $n$. Define the operations in $C_p$ as
follows: for all $a, b \in F_p$ and $u, v \in I_p$,
$a \cdot b = a + b$, $a \otimes b = a + b$, $a \cdot u = u$,
$a \otimes u = u \otimes a = u \otimes v = \top$, $a^\omega = \rho(a)$,
$a^{(\omega)} = \top$, $u^{(\omega)} = \top$.
It is straightforward to check that the identities (1) - (4) and (6) - (9) hold,
together with all of the power identities $(a^n)^\omega = a^\omega$, for integers $n$
not divisible by $p$. However, $(1^p)^\omega = p^\omega = p$ and $1^\omega = 1$. □
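The finite-sort part of this counter-model is easy to check mechanically. The following is a minimal sketch, assuming only the definitions of the series product and the series omega power given in the proof; the function names (`rho`, `power`, `failing_exponents`) and the test bound are illustrative choices, not part of the paper.

```python
def rho(p, n):
    """rho(n) = p if p divides n, and 1 otherwise."""
    return p if n % p == 0 else 1


def series(a, b):
    """Series product on the finite sort: a . b = a + b."""
    return a + b


def omega(p, a):
    """Series omega power of a finite-sort element: a^omega = rho(a)."""
    return rho(p, a)


def power(a, n):
    """n-fold series product a^n = a . a . ... . a, for n >= 1."""
    result = a
    for _ in range(n - 1):
        result = series(result, a)
    return result


def power_identity_holds(p, n, bound=50):
    """Check (a^n)^omega == a^omega for every a in 1..bound (a finite sample)."""
    return all(omega(p, power(a, n)) == omega(p, a) for a in range(1, bound + 1))


def failing_exponents(p, max_n=30):
    """Exponents n <= max_n for which the power identity fails in C_p."""
    return [n for n in range(1, max_n + 1) if not power_identity_holds(p, n)]
```

On such a sample the identity fails exactly for the exponents divisible by $p$, which matches the argument that every finite subset of the power identities has a model violating a later one.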






Zoltan Esik and Satoshi Okawa 



7 Complexity 

Suppose that $t$ and $t'$ are two terms in the variables $X = \{x_1, x_2, \ldots\}$ of
finite sort and $Y = \{y_1, y_2, \ldots\}$ of infinite sort. In order to decide whether
the equation $t = t'$ holds in V, one needs to construct the pomsets $|t|$ and $|t'|$
denoted by $t$ and $t'$ in the free algebra $\omega SP_{X,Y}$, where each variable $x_i$
and $y_j$ is evaluated by the corresponding singleton pomset. By the freeness of
$\omega SP_{X,Y}$, $t = t'$ holds in V iff $|t|$ is isomorphic to $|t'|$. Since these
pomsets may be infinite, it is not immediately clear that this condition is decidable.

Theorem 9. There exists a polynomial time algorithm to decide for a given
equation $t_1 = t_2$ whether $t_1 = t_2$ holds in V.

The simple proof is based on a polynomial time transformation of terms to 
normal form terms (or rather “normal form trees” ) corresponding to the normal 
representation of the pomsets in $\omega SP_{A,B}$ defined in Section 3. Since each term
has a unique normal form tree, the decision problem reduces to checking whether 
two labeled trees are isomorphic, which can be done in polynomial time, cf. [10]. 
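The last step, deciding isomorphism of labeled trees, can be illustrated with a canonical-encoding sketch. This is not the normal form construction of the paper: the `Node` class and its `ordered` flag are hypothetical, chosen only to show how children whose order matters (as under a series product) and children whose order does not matter (as under a parallel product) can be compared.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    label: str                       # hypothetical label (generator or operation tag)
    ordered: bool = False            # True if the order of the children matters
    children: List["Node"] = field(default_factory=list)


def canon(node: Node) -> Tuple:
    """Return a canonical encoding; two trees are isomorphic iff encodings agree."""
    encoded = [canon(child) for child in node.children]
    if not node.ordered:
        # Children of an unordered node are compared as a multiset.
        encoded.sort()
    return (node.label, node.ordered, tuple(encoded))


def isomorphic(s: Node, t: Node) -> bool:
    return canon(s) == canon(t)
```

For example, two unordered nodes with children `a, b` and `b, a` are identified, while the corresponding ordered nodes are not.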



8 Some Further Remarks 

Adding the empty pomset 1 can be done in at least two different ways. First, 
from the geometric point of view, it makes sense to define both omega powers of 
the empty pomset to be the empty pomset. But since the omega powers should 
be of infinite sort, we must add the empty pomset also to the pomsets of infinite 
sort. Moreover, since for each finite $P$, the pomset $P \cdot 1^\omega$ is of infinite sort, each
finite pomset has to be included in the carrier of pomsets of infinite sort. The 
resulting pomset algebras satisfy the following equations involving the empty 
pomset: 



$1 \cdot u = u$                          (10)

$x \cdot 1 = x$                          (11)

$u \otimes 1 = u$                        (12)

$y \otimes 1^\omega = y$                 (13)

[…]                                      (14)

[…] $= x \otimes 1^\omega$,              (15)



where $x$ is of finite sort, $y$ is of infinite sort, and the sort of $u$ can be
either finite or infinite. In fact, these equations and the axioms (1) - (9) form
a basis of identities of the above pomset algebras. The free algebras can be
described as the algebras with the empty pomset and the finite series-parallel
$A$-labeled pomsets contained in both carriers.

From the point of view of processes, both $1^\omega$ and $1^{(\omega)}$ should be
interpreted as an infinite non-terminating process. Supposing that these processes
cannot be distinguished, it makes sense to define the corresponding pomset algebra
as follows. Let $\bot$ be a new symbol which is not in $A \cup B$. Then the pomsets in
$Pom_I^\bot(A, B)$ are those $A \cup B \cup \{\bot\}$-labeled pomsets satisfying the
condition that every vertex labeled in $B \cup \{\bot\}$ is maximal and which are
either infinite or contain a vertex labeled in $B \cup \{\bot\}$. In the algebra
$\omega Pom^\bot(A, B) = (Pom_F(A), Pom_I^\bot(A, B), \cdot, \otimes, 1, {}^\omega, {}^{(\omega)})$,
the operations are defined as before, except that we define
$1^\omega = 1^{(\omega)} = \bot$. The variety $V_\bot$ generated by these algebras
can be axiomatized by the equations (1) - (9) and (10), (11), (12), (14). The free
algebra in $V_\bot$ generated by a pair of sets $(A, B)$ can be described as the algebra
of "generalized series parallel pomsets" in $\omega Pom^\bot(A, B)$ containing both the
empty pomset and the pomset $\bot$.

Besides pomsets, there are other structures of interest that satisfy the
equations (1) - (9). Let $A$ denote a non-empty set, and let $A^+$ denote the
collection of all finite non-empty words, and $A^\omega$ the collection of all
$\omega$-words over $A$. Moreover, let $A^* = A^+ \cup \{\epsilon\}$, where $\epsilon$
is the empty word, and consider the structure
$L_A = (P(A^+), P(A^\omega), \cdot, \otimes, {}^\omega, {}^{(\omega)})$, where $P$
denotes the power set operator, and where $\cdot$ is concatenation, $\otimes$ is
shuffle, ${}^\omega$ is omega power and ${}^{(\omega)}$ is shuffle omega power. Thus,
for all $K \subseteq A^+$ and $U, V \subseteq A^+ \cup A^\omega$,



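$K \cdot U = \{xu : x \in K, u \in U\}$

$U \otimes V = \{u_1 v_1 u_2 v_2 \ldots : u_1 u_2 \ldots \in U, v_1 v_2 \ldots \in V, u_i, v_i \in A^*, i \geq 1\}$

$K^\omega = \{x_1 x_2 \ldots : x_i \in K\}$

$U^{(\omega)} = \{u_{11} u_{21} u_{22} u_{31} u_{32} u_{33} \ldots : u_{i1} u_{i2} \ldots \in U, u_{ij} \in A^*, i \geq 1\}$.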

Thus, $K^\omega$ and $U^{(\omega)}$ are in $P(A^\omega)$, while the result obtained by
applying a binary operation is in $P(A^\omega)$ iff one of the two arguments is in
this set. (Since both $P(A^+)$ and $P(A^\omega)$ contain the empty set, a distinction
should be made whether an empty set argument is considered to be a member of
$P(A^+)$ or a member of $P(A^\omega)$.) It is straightforward to show that $L_A$
satisfies the equations (1) - (9), so that $L_A$ is in fact in V. (The equations not
involving $\otimes$ and ${}^{(\omega)}$ define the binoids of [18] and give a complete
axiomatization of the equational theory of the corresponding language structures
involving only concatenation and omega power.) However, the language structures
$L_A$ also satisfy equations that do not hold in all algebras belonging to V. A
simple example is the equation $x^\omega \otimes x^{(\omega)} = x^{(\omega)}$. It is an
open problem to find a characterization of the equations that hold in all language
structures $L_A$.
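The two binary operations can be tried out on finite words. The sketch below is only a finite analogue, under the assumption that all words involved are finite Python strings; the omega powers produce infinite words and are not modelled. The function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def shuffle_words(u: str, v: str) -> frozenset:
    """All interleavings of two finite words that keep each word's letter order."""
    if not u:
        return frozenset({v})
    if not v:
        return frozenset({u})
    first_from_u = {u[0] + w for w in shuffle_words(u[1:], v)}
    first_from_v = {v[0] + w for w in shuffle_words(u, v[1:])}
    return frozenset(first_from_u | first_from_v)


def concat(K: set, U: set) -> set:
    """K . U = {xu : x in K, u in U}."""
    return {x + u for x in K for u in U}


def shuffle(U: set, V: set) -> set:
    """Shuffle of two finite languages: union of the shuffles of their words."""
    return set().union(*(shuffle_words(u, v) for u in U for v in V))
```

Since concatenation is one particular interleaving, `concat(K, U)` is always a subset of `shuffle(K, U)` in this finite setting.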

Due to our motivation in concurrency, we have not allowed series products
$P \cdot Q$ and series omega powers $P^\omega$ for infinite $P$. This restriction was
achieved by considering two-sorted pomset algebras, making it possible to avoid the
heavier machinery of partial algebras. Another possibility is to define
$P \cdot Q = P$ and $P^\omega = P$, for infinite $P$. However, in this case, the free
algebras and the valid equations do not seem to have a nice description. From the
mathematical point of view, it is also of interest to place no restriction on the
applicability of the series product and omega power operations. We will address the
one-sorted pomset models that arise in this way in a forthcoming paper.






References 

1. L. Aceto. Full abstraction for series-parallel pomsets. In: TAPSOFT 91, LNCS
493, 1-25, Springer-Verlag, 1991.

2. Ch. Baier and M. E. Majster-Cederbaum. Denotational semantics in the cpo and
metric approach. Theoret. Comput. Sci., 135:171-220, 1994.

3. S. L. Bloom and Z. Esik. Free shuffle algebras in language varieties. Theoret.
Comput. Sci., 163:55-98, 1996.

4. S. L. Bloom and Z. Esik. Shuffle binoids. Theoret. Inform. Appl., 32:175-198, 1998.

5. G. Boudol and I. Castellani. Concurrency and atomicity. Theoret. Comput. Sci.,
59:25-84, 1988.

6. B. Bloom and M. Kwiatkowska. Trade-offs in true concurrency: Pomsets and
Mazurkiewicz traces. In: MFPS 91, LNCS 598, 350-375, Springer-Verlag, 1992.

7. J. W. de Bakker and J. H. A. Warmerdam. Metric pomset semantics for a
concurrent language with recursion. In: Semantics of Systems of Concurrent
Processes, LNCS 469, 21-49, Springer-Verlag, 1990.

8. J. L. Gischer. The equational theory of pomsets. Theoret. Comput. Sci.,
61:199-224, 1988.

9. J. Grabowski. On partial languages. Fund. Inform., 4:427-498, 1981.

10. L. Kucera. Combinatorial Algorithms. Adam Hilger, Bristol and Philadelphia,
1990.

11. K. Lodaya and P. Weil. Series-parallel posets: algebra, automata and languages.
In: STACS'98, LNCS 1373, 555-565, Springer-Verlag, 1998.

12. A. Mazurkiewicz. Concurrency, modularity and synchronization. In: MFCS 89,
LNCS 379, 577-598, Springer-Verlag, 1989.

13. J.-J. Ch. Meyer and E. P. de Vink. Pomset semantics for true concurrency with
synchronization and recursion. In: MFCS 89, LNCS 379, 360-369, 1989.

14. V. Pratt. Modeling concurrency with partial orders. Internat. J. Parallel
Processing, 15:33-71, 1986.

15. A. Rensink. Algebra and theory of order-deterministic pomsets. Notre Dame J.
Formal Logic, 37:283-320, 1996.

16. S. T. Tschantz. Languages under concatenation and shuffling. Math. Structures
Comput. Sci., 4:505-511, 1994.

17. J. Valdes, R. E. Tarjan, and E. L. Lawler. The recognition of series-parallel
digraphs. SIAM Journal on Computing, 11(2):298-313, 1982.

18. Th. Wilke. An Eilenberg theorem for ∞-languages. In: ICALP 91, LNCS 510,
588-599, 1991.

19. H. Wimmel and L. Priese. Algebraic characterization of Petri net pomset
semantics. In: CONCUR 97, LNCS 1243, 406-420, Springer-Verlag, 1997.

20. J. Winkowski. Behaviours of concurrent systems. Theoret. Comput. Sci., 12:39-60,
1980.

21. J. Winkowski. Concatenable weighted pomsets and their applications to modelling
processes of Petri nets. Fund. Inform., 28:403-421, 1996.

22. G. Winskel. Event structures. In: Petri Nets: Applications and Relationships to
Other Models of Concurrency, Advances in Petri Nets 1986, Part II, Proceedings
of an Advanced Course, LNCS 255, 325-392, Springer-Verlag, 1987.



Unreliable Failure Detectors with Limited Scope 
Accuracy and an Application to Consensus 



Achour Mostefaoui and Michel Raynal 

IRISA - Campus de Beaulieu, 35042 Rennes Cedex, France 
{mostefaoui,raynal}@irisa.fr



Abstract. Let the scope of the accuracy property of an unreliable failure
detector be the minimum number ($k$) of processes that may not erroneously
suspect a correct process to have crashed. Classical failure detectors
implicitly consider a scope equal to $n$ (the total number of processes). This
paper investigates accuracy properties with limited scope, thereby giving rise
to the $S_k$ and $\Diamond S_k$ classes of failure detectors.

A reduction protocol transforming any failure detector belonging to $S_k$
(resp. $\Diamond S_k$) into a failure detector (without limited scope) of the
class $S$ (resp. $\Diamond S$) is given. This reduction protocol requires
$f < k$, where $f$ is the maximum number of process crashes. (This leaves open
the problem of proving or disproving that this condition is necessary.)

Then, the paper studies the consensus problem in asynchronous distributed
message-passing systems equipped with a failure detector of the class
$\Diamond S_k$. It presents a simple consensus protocol that is explicitly
based on $\Diamond S_k$. This protocol requires $f < \min(k, n/2)$.



1 Introduction 

Several crucial practical problems (such as atomic broadcast and atomic commit)
encountered in the design of reliable applications built on top of unreliable
asynchronous distributed systems actually belong to the same family of problems,
namely, the family of agreement problems. This family of problems can be
characterized by a single problem, namely the Consensus problem, that is their
greatest common denominator. That is why the consensus problem is considered a
fundamental problem. This is practically and theoretically very important. From a
practical point of view, this means that any solution to consensus can be used as
a building block on top of which solutions to particular agreement problems can be
designed. From a theoretical point of view, this means that an agreement problem
cannot be solved in systems where consensus cannot be solved.

Informally, the consensus problem can be defined in the following way. Each
process proposes a value and all correct processes have to decide on the same
value, which has to be one of the proposed values. Solving the consensus problem
in asynchronous distributed systems where processes may crash is far from being
a trivial task. It has been shown by Fischer, Lynch and Paterson [4] that there
is no deterministic solution to the consensus problem in those systems as soon
as processes (even only one) may crash. This impossibility result comes from
the fact that, due to the uncertainty created by asynchrony and failures, it
is impossible to distinguish a "slow" process from a crashed process or from
a process with which communications are very slow. So, to be able to solve
agreement problems in asynchronous distributed systems, those systems have
to be "augmented" with additional assumptions that make consensus solvable
in such improved systems. A major and decisive advance in this direction
was made by Chandra and Toueg, who proposed the Unreliable Failure
Detector concept [2].

C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS'99, LNCS 1738, pp. 329-341, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999

A failure detector can informally be seen as a set of oracles, one per process.
The failure detector module (oracle) associated with a process provides it with
a list of processes it guesses to have crashed. A failure detector can make
mistakes by not suspecting a crashed process, or by erroneously suspecting a correct
process. In their seminal paper [2], Chandra and Toueg have defined two types
of property to characterize classes of failure detectors. A class is defined by a
Completeness property and an Accuracy property. A completeness property
concerns the actual detection of crashes. The completeness property we are interested
in basically states that "every crashed process is eventually suspected by every
correct process". An accuracy property limits the mistakes a failure detector can
make. In this paper, we are mainly interested in Weak Accuracy. Such a property
basically states that "there is a correct process that is not suspected". Weak
accuracy is perpetual if it has to be satisfied from the beginning. It is eventual
if it is allowed to be satisfied only after some (unknown but finite) time. The
class of failure detectors satisfying completeness and perpetual (resp. eventual)
weak accuracy is denoted $S$ (resp. $\Diamond S$). Let $n$ and $f$ denote the total
number of processes and the maximum number of processes that may crash, respectively.
$S$-based consensus protocols have been proposed in [2, 7]; they require $f \leq n - 1$.
$\Diamond S$-based consensus protocols have been proposed in [2, 6, 7, 9]; they require
$f < n/2$ (which has been shown to be a necessary condition with eventual
accuracy [2]). Consequently, agreement problems can be solved in asynchronous
distributed systems augmented with unreliable failure detectors of the classes $S$
and $\Diamond S$.

The (perpetual/eventual) weak accuracy property actually has a scope spanning
the whole system: there is a correct process that (from the beginning or
after some time) is not suspected by the other processes. Here, the important
issue is that the "non-suspicion of a correct process" concerns all other processes.
In this paper, we investigate failure detector classes whose accuracy property
has a limited scope: the number of processes that must not suspect a correct
process is limited to $k$ ($k \leq n$). The parameter $k$ defines the scope of the weak
accuracy property, thereby giving rise to the classes $S_k$ and $\Diamond S_k$ of
unreliable failure detectors ($S_n$ and $\Diamond S_n$ corresponding to $S$ and
$\Diamond S$, respectively). This paper has two aims. The first is to investigate the
relation between the scope of the weak accuracy property and the maximal number of
failures the system can suffer. The second is the design of consensus protocols based
on failure detectors with limited scope accuracy.






The paper is composed of five sections. Section 2 introduces the computation
model and Chandra-Toueg's failure detectors. Section 3 first defines weak
accuracy with $k$-limited scope and the corresponding classes of failure detectors.
Then, a reduction protocol that transforms any failure detector belonging to $S_k$
(resp. $\Diamond S_k$) into a failure detector of the class $S$ (resp. $\Diamond S$) is
described. This transformation requires $f < k$. So, it relates the scope of the
accuracy property to the number of failures. It is important to note that this
transformation does not require assumptions involving a majority of correct processes.
Consequently, when $f < k$, the stacking of an $S$-based (or a $\Diamond S$-based)
consensus protocol on top of the proposed transformation constitutes a solution to the
consensus problem based on a failure detector with limited scope accuracy. Section 4
investigates a "direct" approach to solve the consensus problem on top of an
asynchronous distributed system equipped with a $\Diamond S_k$ failure detector. The
proposed protocol directly relies on $\Diamond S_k$ and requires $f < \min(k, n/2)$.
Finally, Section 5 concludes the paper.



2 Asynchronous Distributed Systems and Unreliable 
Failure Detectors 

2.1 Asynchronous Distributed System with Process Crash Failures 

We consider a system consisting of a finite set $\Pi$ of $n > 1$ processes, namely,
$\Pi = \{p_1, p_2, \ldots, p_n\}$. A process can fail by crashing, i.e., by prematurely
halting. It behaves correctly (i.e., according to its specification) until it
(possibly) crashes. By definition, a correct process is a process that does not crash.
CORRECT denotes the set of correct processes. A faulty process is a process that is
not correct. As previously indicated, $f$ denotes the maximum number of processes that
can crash. Processes communicate and synchronize by sending and receiving
messages through channels. Every pair of processes is connected by a channel.
Channels are not required to be FIFO, but are assumed to be reliable: they do
not create, alter or lose messages. There is no assumption about the relative
speed of processes or message transfer delays.



2.2 Chandra-Toueg's Unreliable Failure Detectors

Informally, a failure detector consists of a set of modules, each one attached
to a process: the module attached to $p_i$ maintains a set (named $suspected_i$) of
processes it currently suspects to have crashed. Any failure detector module is
inherently unreliable: it can make mistakes by not suspecting a crashed process or
by erroneously suspecting a correct one. Moreover, suspicions are not necessarily
stable: a process $p_j$ can be added to or removed from a set $suspected_i$ according
to whether $p_i$'s failure detector module currently suspects $p_j$ or not. As in other
papers devoted to failure detectors, we say "process $p_i$ suspects process $p_j$" at
some time $t$, if at that time we have $p_j \in suspected_i$.






As indicated in the Introduction, a failure detector class is formally defined
by two abstract properties, namely a Completeness property and an Accuracy
property. In this paper, we consider the following completeness property [2]:

— Strong Completeness: Eventually, every process that crashes is permanently 
suspected by every correct process. 

Among the accuracy properties defined by Chandra and Toueg [2] we consider 
here the two following ones: 

— Perpetual Weak Accuracy: Some correct process is never suspected. 

— Eventual Weak Accuracy: There is a time after which some correct process is 
never suspected by correct processes. 

Combined with the completeness property, these accuracy properties define 
the following two classes of failure detectors [2] : 

— S: The class of Strong failure detectors. This class contains all the failure 
detectors that satisfy the strong completeness property and the perpetual 
weak accuracy property. 

— $\Diamond S$: The class of Eventually Strong failure detectors. This class contains
all the failure detectors that satisfy the strong completeness property and the
eventual weak accuracy property.

Clearly, $S \subset \Diamond S$. As indicated in the Introduction, $S$-based consensus
protocols are described in [2, 7], and $\Diamond S$-based consensus protocols are
described in [2, 6, 7, 9]. The $\Diamond S$-based protocols require $f < n/2$. It has
been proved that this requirement is necessary [2]. So, all these protocols are
optimal with respect to the maximum number of crashes they tolerate.



3 Failure Detectors with k-Limited Weak Accuracy

3.1 Definition 

As noted in the Introduction, the weak accuracy property involves all the correct
processes, and consequently spans the whole system. This "whole system
spanning" makes the weak accuracy property more difficult to satisfy than if
it involved only a subset of processes. This observation is the guideline of the
following definition, where the parameter $k$ defines the scope of the accuracy
property. The $k$-accuracy property is satisfied if there exists a set $Q$ of
processes such that:

1. $|Q| = k$ (Scope)

2. $Q \cap CORRECT \neq \emptyset$ (At least one correct process)

3. $\exists p \in Q \cap CORRECT$ such that: $\forall q \in Q$: $q$ does not suspect $p$ (No suspicion)

$k$-accuracy means that there is a correct process that is not suspected by a set
of $k$ processes (some of those $k$ processes may be correct, some others faulty).
Practically, this means there is a cluster of $k$ processes, including at least one
correct process, whose failure detector modules do not erroneously suspect one
of them that is correct. It is easy to see that when the scope $k$ is equal to $n$,
we get traditional weak accuracy. Perpetual (resp. eventual) weak $k$-accuracy
is satisfied if $k$-accuracy is satisfied from the beginning (resp. after some finite
time). Given a scope parameter ($k$), we get the following two classes of failure
detectors, à la Chandra-Toueg:

— $S_k$: This class contains all the failure detectors that satisfy strong
completeness and perpetual weak $k$-accuracy.

— $\Diamond S_k$: This class contains all the failure detectors that satisfy strong
completeness and eventual weak $k$-accuracy.
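The three conditions of $k$-accuracy can be checked on a single snapshot of the suspicion sets. The following is a minimal sketch under stated assumptions: processes are numbered from 0 to n-1, `suspects[q]` is the set of processes that $q$ currently suspects (a crashed process is assumed to suspect nobody), and the function name is illustrative. It tests one instant only; the perpetual and eventual variants quantify over time.

```python
from typing import Dict, Optional, Set, Tuple


def k_accuracy_witness(n: int, k: int, correct: Set[int],
                       suspects: Dict[int, Set[int]]) -> Optional[Tuple[int, Set[int]]]:
    """Return (p, Q) satisfying the three conditions, or None if no such pair exists."""
    for p in sorted(correct):
        # Processes that do not suspect p at this instant.
        non_suspecting = [q for q in range(n) if p not in suspects.get(q, set())]
        # p must belong to Q, so p must not suspect itself, and Q needs k members.
        if p in non_suspecting and len(non_suspecting) >= k:
            others = [q for q in non_suspecting if q != p][: k - 1]
            return p, {p, *others}
    return None
```

Because $Q$ must contain $p$ and only processes that do not suspect $p$, it is enough to count the non-suspecting processes; no enumeration of all $k$-subsets is needed.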



3.2 From k-Accuracy to (Full) Accuracy

This section describes a protocol (Figure 1) that transforms any failure detector
of the class $S_k$ (resp. $\Diamond S_k$) into a failure detector of the class $S$
(resp. $\Diamond S$).

Local variables Let $a$ be a set. To ease the presentation, $a$ is represented by an
array such that $a[\ell] = true$ means $p_\ell \in a$. Each process $p_i$ has a local
boolean matrix, namely, $k\_suspect_i[1:n, 1:n]$, representing a vector of sets. Let us
first consider the $i$-th row of this matrix, namely, $k\_suspect_i[i, *]$. This row
represents the set actually provided by the underlying layer: by assumption, it
satisfies the properties defining $S_k$ (resp. $\Diamond S_k$). If
$k\_suspect_i[i, \ell]$ is true, we say "$p_i$ k-suspects $p_\ell$". Let us now consider
a row $j \neq i$. The entry $k\_suspect_i[j, \ell]$ has the value true if, and only if,
to $p_i$'s knowledge, $p_j$ has not crashed and is k-suspecting $p_\ell$. Finally, the
set $suspected_i[1:n]$ is provided by the protocol to the upper layer; it satisfies the
properties defined by $S$ (resp. $\Diamond S$). When $p_\ell \in suspected_i$ (i.e.,
$suspected_i[\ell] = true$) we say "$p_i$ suspects $p_\ell$".

Local behavior Processes permanently exchange their $k\_suspect_i[i, *]$ sets (line 2),
locally provided by the underlying failure detector. When $p_i$ receives such a set
from $p_j$ (line 3), it first updates its view of $p_j$'s k-suspicions (line 4). Then,
$p_i$ examines $p_j$'s k-suspicions. More precisely, if $p_j$ does not k-suspect
$p_\ell$, then $p_i$ does not suspect $p_\ell$ (lines 5-6). If $p_j$ k-suspects
$p_\ell$, then $p_i$ tests whether the number of processes it perceives as non-crashed
and k-suspecting $p_\ell$ exceeds the threshold $n - k$ (line 7). If this condition is
true, $p_i$ considers that $p_\ell$ has crashed, and consequently adds $p_\ell$ to
$suspected_i$ (line 8) and resets the row $k\_suspect_i[\ell, *]$ (line 9).

As the proof will show, this reduction protocol assumes $f < k$ (it is important
to note that the protocol does not require a "majority of correct processes"
assumption). This has an interesting practical consequence: when $f$ is small
(i.e., $f \ll n$), $k$ (the accuracy scope) can be small too ($k \ll n$). More
precisely, $k$ is not required to be $O(n)$, but only $O(f)$. This is particularly
attractive to face scaling problems^1.

^1 Moreover, the protocol can be simplified when one is interested only in the reduction
from $S_k$ to $S$. In that case, the weak $k$-accuracy property is perpetual, hence, there






(1) init: ∀(x, ℓ): k_suspect_i[x, ℓ] ← false; ∀ℓ: suspected_i[ℓ] ← false;

(2) repeat forever: ∀p_j: do send K_SUSPICION(k_suspect_i[i, *], i) to p_j enddo

(3) when K_SUSPICION(k_susp, j) is received:

(4)    ∀ℓ: k_suspect_i[j, ℓ] ← k_susp[ℓ];

(5)    ∀ℓ: if ¬(k_suspect_i[j, ℓ])

(6)          then suspected_i[ℓ] ← false

(7)          else if |{x | k_suspect_i[x, ℓ]}| > (n − k)

(8)                 then suspected_i[ℓ] ← true;

(9)                      ∀y: k_suspect_i[ℓ, y] ← false

(10)         endif endif



Fig. 1. From $S_k$/$\Diamond S_k$ to $S$/$\Diamond S$
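The local state and the message handler of Figure 1 can be mirrored in a few lines. The following is a minimal sketch, not the authors' implementation: the class name `Reducer`, the 0-based process indices, and the methods `set_local`, `outgoing` and `on_message` are assumptions introduced for illustration, and message transport is left out. The comments refer to the line numbers of Figure 1.

```python
from typing import List, Tuple


class Reducer:
    """Local state of process p_i in the reduction protocol (indices are 0-based)."""

    def __init__(self, i: int, n: int, k: int) -> None:
        self.i, self.n, self.k = i, n, k
        # Line 1: no k-suspicion and no suspicion initially.
        self.k_suspect = [[False] * n for _ in range(n)]
        self.suspected = [False] * n

    def set_local(self, k_susp: List[bool]) -> None:
        """Row i is provided by the underlying S_k / eventual-S_k detector."""
        self.k_suspect[self.i] = list(k_susp)

    def outgoing(self) -> Tuple[List[bool], int]:
        """Payload of the K_SUSPICION message sent at line 2."""
        return list(self.k_suspect[self.i]), self.i

    def on_message(self, k_susp: List[bool], j: int) -> None:
        """Lines 3-10: handle K_SUSPICION(k_susp, j)."""
        self.k_suspect[j] = list(k_susp)                      # line 4
        for l in range(self.n):
            if not self.k_suspect[j][l]:                      # line 5
                self.suspected[l] = False                     # line 6
            else:
                count = sum(1 for x in range(self.n) if self.k_suspect[x][l])
                if count > self.n - self.k:                   # line 7
                    self.suspected[l] = True                  # line 8
                    self.k_suspect[l] = [False] * self.n      # line 9
```

With n = 4 and k = 3 the threshold is n - k = 1, so a process is added to `suspected` only once two distinct rows k-suspect it, and it is removed again as soon as a received row does not k-suspect it.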



3.3 Proof 

Theorem Let $f < k$. The protocol described in Figure 1 transforms any failure
detector $\in S_k$ (resp. $\in \Diamond S_k$) into a failure detector $\in S$
(resp. $\in \Diamond S$).

Proof The proof is made of three parts. 

i. Let us first note that (by assumption) the underlying failure detector (whether
it belongs to $S_k$ or to $\Diamond S_k$) satisfies strong completeness. We first show
that the protocol preserves this property, i.e., if a process $p_\ell$ crashes and if
$p_i$ is correct, then $p_\ell$ eventually remains permanently in $suspected_i$ (i.e.,
$suspected_i[\ell]$ remains true forever).

Note that there is a time $t$ after which (1) all the faulty processes have
crashed, (2) all the correct processes permanently k-suspect all the faulty
processes, and (3) all K_SUSPICION messages not k-suspecting faulty processes have
been received. Note that any K_SUSPICION message sent after $t$ includes the set
of crashed processes (more precisely, if $p_\ell$ has crashed, the corresponding entry
$k\_suspect_i[i, \ell]$ remains true forever). Let us consider a crashed process
$p_\ell$. As there are at least $n - f$ correct processes, for each correct process
$p_i$, there is a time $> t$ after which $p_i$ has received (line 3) K_SUSPICION
messages including $p_\ell$ from at least $n - f$ distinct processes. Due to $f < k$,
we have $n - f > n - k$. Consequently, $p_i$ includes $p_\ell$ in $suspected_i$ (line 8).
Moreover, as after $t$, for any correct process $p_j$, $k\_suspect_j[j, \ell]$ remains
true, and as all messages sent by faulty processes have been received, it follows that
the test at line 5 will never be satisfied and consequently $p_\ell$ will never be
suppressed from $suspected_i$ (i.e., $suspected_i[\ell]$ will never be set to false at
line 6).

is a correct process $p_u$ that is never suspected by $k$ processes. This means that,
$\forall p_i$, we have $|\{x \mid k\_suspect_i[x, u]\}| \leq (n - k)$. It follows that
the lines 6 and 9 can be suppressed.






ii. Let us now assume that the $k\_suspect_i[i, *]$ sets satisfy perpetual weak
$k$-accuracy (so, the underlying failure detector belongs to $S_k$). We show that the
$suspected_i$ sets satisfy perpetual weak accuracy.

It follows from the weak $k$-accuracy property that there is a correct process $p_u$
that is never k-suspected by $k$ processes. For $p_u$ to be included in $suspected_i$
(line 8), it is necessary (line 7) that $p_i$ receives at least $(n - k + 1)$
K_SUSPICION messages including $p_u$ (i.e., messages such that $k\_susp[u] = true$).
As $k$ processes never k-suspect $p_u$, this is impossible.

iii. Let us finally assume that the underlying failure detector satisfies eventual
weak $k$-accuracy (so, it belongs to $\Diamond S_k$). We show that the $suspected_i$
sets satisfy eventual weak accuracy.

Let $p_u$ be the correct process that, after some time $t$, is never k-suspected by
a set $Q$ of $k$ processes ($p_u \in Q$). We first show that eventually, for any correct
process $p_i$, we have the following relation:
$|\{x \mid k\_suspect_i[x, u]\}| \leq (n - k)$. Let us consider $p_z \in Q$.

— Case: $p_z$ is correct. There is a time $t1_z$ after which $p_i$ will receive from
$p_z$ only K_SUSPICION messages not including $p_u$ (i.e., messages from $p_z$ such
that $k\_susp[u]$ is equal to false). Due to line 4, it follows that the predicate
$k\_suspect_i[z, u] = false$ eventually holds forever.

— Case: $p_z$ is faulty. Let us first observe that there is a time after which $p_i$
will no longer receive messages from $p_z$. From this time, the update of
$k\_suspect_i[z, u]$ (line 4) is no longer executed. Moreover, due to the completeness
property (point i), there is a time $t2_z$ after which all processes permanently
suspect $p_z$. After $t2_z$, each time $p_i$ adds $p_z$ to $suspected_i$ (line 8), it
also updates $k\_suspect_i[z, u]$ to false (line 9). It follows that there is a time
after which the predicate $k\_suspect_i[z, u] = false$ holds forever.

Thus, as $|Q| = k$, there is a time after which, for any correct process $p_i$,
$|\{x \mid k\_suspect_i[x, u]\}| \leq n - k$ holds forever. Consequently, as far as
$p_u$ is concerned, the test of line 7 will always be false after that time, and $p_u$
will never again be added to $suspected_i$.

Finally, we show that, if it belongs to $suspected_i$, $p_u$ will be withdrawn
from this set. Note that (by assumption) $Q$ contains only processes that after
some time stop suspecting $p_u$. From the assumption $f < k$, we conclude
that, after some time, $Q$ contains a correct process $p_j$ that stops suspecting
$p_u$. So, for each correct process $p_i$, there is a time after which $p_i$ receives
from $p_j$ only K_SUSPICION messages not including $p_u$ (i.e., messages from $p_j$
such that $k\_susp[u]$ is always false). Then, when such a message is received from
$p_j$, due to lines 4-6, it follows that if $p_u$ is included in $suspected_i$, it is
definitely suppressed from it.

It follows from the previous discussion that there is a time after which there is
a correct process $p_u$ such that the boolean $suspected_i[u]$ remains always false for
any correct process $p_i$. Hence, the $suspected_i$ sets satisfy eventual weak accuracy.






3.4 An Open Problem 

The previous transformation leaves open the problem of proving or disproving that
the condition $f < k$ is necessary. We conjecture this condition is necessary. If it
is, $k - 1$ is an upper bound for the number of processes that may crash when
reducing $S_k$ (resp. $\Diamond S_k$) to $S$ (resp. $\Diamond S$)^2. Let us also note
that this constraint is reminiscent of the impossibility constraint attached to the
$k$-set agreement problem [1, 3].

4 A $\Diamond S_k$-Based Consensus Protocol

This section is focused on the design of protocols solving the consensus problem
in asynchronous message-passing distributed systems equipped with failure
detectors with limited scope weak accuracy. As noted in the Introduction, a
solution can be obtained by stacking an $S$/$\Diamond S$-based consensus protocol on
top of the previous transformation (Figure 1). Another approach consists in designing
a consensus protocol that directly uses a failure detector with limited scope
accuracy. The rest of this section presents such a $\Diamond S_k$-based consensus
protocol.

4.1 The Consensus Problem 

In the Consensus problem, every correct process $p_i$ proposes a value $v_i$ and all
correct processes have to decide on some value $v$, in relation to the set of proposed
values. More precisely, the Consensus problem is defined by the following three
properties [2, 4]:

— Termination: Every correct process eventually decides on some value. 

— Validity: If a process decides v, then v was proposed by some process. 

— Agreement: No two correct processes decide differently. 

The agreement property applies only to correct processes. So, it is possible that 
a process decides on a distinct value just before crashing. Uniform Consensus 
prevents such a possibility. It has the same Termination and Validity properties 
plus the following agreement property: 

— Uniform Agreement: No two processes (correct or not) decide differently. 

In the following we are interested in the Uniform Consensus problem. 
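The properties above can be checked mechanically on a finished run. The following is a minimal sketch, assuming a run is summarized by three hypothetical inputs that do not appear in the paper: a map of proposed values, a map of decided values (possibly including processes that crashed after deciding), and the set of correct processes.

```python
from typing import Dict, Hashable, Set


def check_consensus(proposed: Dict[int, Hashable],
                    decided: Dict[int, Hashable],
                    correct: Set[int]) -> Dict[str, bool]:
    """Evaluate the Consensus and Uniform Consensus properties on one run.

    proposed: process id -> proposed value
    decided:  process id -> decided value (only processes that decided)
    correct:  ids of the processes that never crash in this run
    """
    proposals = set(proposed.values())
    correct_decisions = {decided[p] for p in correct if p in decided}
    return {
        # Termination: every correct process has decided.
        "termination": all(p in decided for p in correct),
        # Validity: every decided value was proposed by some process.
        "validity": all(v in proposals for v in decided.values()),
        # Agreement: correct processes do not decide differently.
        "agreement": len(correct_decisions) <= 1,
        # Uniform Agreement: no two processes at all decide differently.
        "uniform_agreement": len(set(decided.values())) <= 1,
    }
```

A run in which a process decides a different value just before crashing satisfies Agreement but not Uniform Agreement, which is exactly the distinction drawn in the text.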

4.2 Underlying Principles 

As in other failure detector-based consensus protocols [2, 6, 7, 9], each process p_i
manages a local variable est_i which contains its current estimate of the decision
value. Initially, est_i is set to v_i, the value proposed by p_i. Processes proceed in
consecutive asynchronous rounds. (In the following, the line numbers implicitly
refer to Figure 2.)

^ A protocol transforming failure detectors with limited accuracy into failure detectors
with full accuracy is described in [5]. This protocol assumes a minority of crashes
and a majority scope, i.e., it works only when f < n/2 < k. (Moreover, this protocol
does not work in all failure patterns when it reduces ◇S_k to ◇S.)



Unreliable Failure Detectors with Limited Scope Accuracy 337 



Round coordination. Each round r (initially, for each process p_i, r_i = 0) is
managed by a predetermined set ck (current kernel) of k processes. The ck set
associated with round r is defined by coord(r) (line 3). This function is defined
as follows. Its domain is the set of natural numbers, its codomain is the finite
set of all subsets of k distinct processes, and it realizes the following
mapping: as r ranges over the increasing sequence of natural numbers, each set of
exactly k processes is output infinitely often by coord.
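One simple way to obtain such a mapping is to cycle through all k-subsets in a fixed order. The sketch below is only one possible choice, not necessarily the one the authors have in mind; it assumes processes are numbered 0..n-1 and rounds start at 1.

```python
from itertools import combinations, islice
from math import comb


def coord(r: int, n: int, k: int) -> frozenset:
    """Kernel of round r: round-robin over all k-subsets of {0, ..., n-1}.

    Every k-subset is returned once per block of C(n, k) consecutive rounds,
    hence infinitely often as r grows.
    """
    if r < 1 or not 1 <= k <= n:
        raise ValueError("need r >= 1 and 1 <= k <= n")
    index = (r - 1) % comb(n, k)
    # combinations() enumerates the k-subsets in lexicographic order.
    subset = next(islice(combinations(range(n), k), index, None))
    return frozenset(subset)
```

Since the mapping is a pure function of r, n and k, all processes compute the same kernel for a given round without exchanging messages.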

During round r, the k processes of coord(r) play the coordinator role for the
round. Due to the definition of coord, any set of k processes will repeatedly
coordinate rounds. Each round is made of two phases. The aim of the first phase
is to provide processes with a single value v coming from ck (the current kernel)
or with a default value ⊥ (when the current kernel does not provide a single
value). Then, the aim of the second phase is to decide a value v provided by ck,
when a process has received only this value during this second phase.

The underlying principles of the protocol are close to the ones used in [7] 
(where each kernel is made of a single process). 

First phase of a round. To realize its aim, the first phase (lines 4-14) is made of
two communication steps. During the first step (lines 4-8), the k processes of the
current kernel ck execute a sub-protocol solving an instance of a problem we call
TWA (Terminating Weak Agreement). This problem is defined in the context of
asynchronous systems equipped with failure detectors. Each process proposes a
value, and correct processes have to decide (we say “a process TWA-decides”) a
value in such a way that the following properties are satisfied:

— TWA-Termination: Every correct process eventually TWA-decides on some
value. 

— TWA-Validity: If a process TWA-decides v, then v was proposed by some 
process. 

— TWA-Agreement: If there is a process that is not suspected by the other 
processes, then no two processes TWA-decide differently. 

TWA is close to but weaker than consensus. A TWA protocol has to satisfy the 
consensus agreement property only when there is a correct process that is not 
suspected by the other processes^. 

So, first, the processes in ck solve an instance of the TWA problem (line 6).
This instance is identified by the current round number. A process p_i ∈ ck starts
its participation in a TWA protocol by invoking TWA_k_propose(r_i, est_i) and

^ Actually, the S-based consensus protocol described in [2] solves the TWA problem.
More precisely, if the underlying failure detector satisfies the properties defining S,
this protocol ensures that the correct processes decide the same (proposed) value.
If the underlying failure detector does not satisfy the properties defining S, the
correct processes decide on possibly different (proposed) values. When this protocol
is used to solve instances of the TWA problem in the context of the S_k/◇S_k-based
consensus protocol, each of its executions involves a particular set of k processes out
of n, defined by coord(r).






terminates it by invoking TWA_k_decide(r_i), which provides it with a TWA-decided
value. Then each process ∈ ck broadcasts (line 7) the value it has TWA-decided,
and each process ∉ ck waits until it has received a TWA-decided value, i.e., a
value sent by a process ∈ ck (line 4). After this exchange, each process has a
value (kept in ph1_est) that has been TWA-decided by a process ∈ ck.

The second step of the first phase (lines 9-14) starts with each process ∉ ck
broadcasting the value it has received. So, each process p_j has broadcast (either
at line 7 or at line 9) a value (ph1_est_j) TWA-decided by a process ∈ ck. Then,
each process p_i waits until it has received ph1_est values from a majority of
processes. If all those values are equal (say to v), p_i adopts v (line 12). Otherwise, p_i
adopts the default value ⊥ (line 13). A process p_i keeps the value (v or ⊥) it has
adopted in the local variable ph2_est_i.

The aim of the first phase of a round is actually to ensure that the value of
each ph2_est_i local variable is equal either to a same value v TWA-decided by a
process of ck, or to ⊥.

Second phase of a round. The aim of the second phase is to force processes 
to decide, and for those that cannot decide, to ensure the consensus agreement 
property will not be violated. 

This is done in a way similar to the second step of the first phase. Processes
exchange their ph2_est_i values. When a process p_i has received such values
from a majority set (line 16), it considers the set of received values, namely,
ph2_rec_i. According to the value of this set, p_i either progresses to the next
round (case ph2_rec_i = {⊥}, line 18), or decides (case ph2_rec_i = {v}, line 19),
or considers v as its new estimate est_i and progresses to the next round (case
ph2_rec_i = {v, ⊥}, line 20). As ph2_rec_i contains values received from a majority
set, if ph2_rec_i = {v}, then it follows that ∀ p_j : v ∈ ph2_rec_j. Combined with
lines 19-20, this guarantees that if a value v has been decided, then all estimates
are then equal to v, and consequently, no other value can be decided in future
rounds.
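The two local decision rules described above (adoption at the end of the first phase, and the three-way case of the second phase) can be sketched as pure functions. This is only an illustration of the case analysis, not the message-passing protocol itself; the names `BOTTOM`, `phase1_adopt` and `phase2_step` are hypothetical, and the default value ⊥ is modelled by a sentinel that is assumed never to be proposed.

```python
BOTTOM = object()  # stands for the default value (assumed never proposed)


def phase1_adopt(ph1_values):
    """Second step of phase 1: adopt v if all received values equal v, else the default."""
    received = set(ph1_values)
    if not received:
        raise ValueError("at least one value must have been received")
    if len(received) == 1:
        return next(iter(received))
    return BOTTOM


def phase2_step(ph2_values, est):
    """Phase 2: return (action, estimate) from the values received from a majority.

    action is "decide" or "next_round"; est is the current estimate of the caller.
    """
    received = set(ph2_values)
    proper = received - {BOTTOM}
    if len(proper) > 1:
        # Phase 1 is meant to rule this out: at most one non-default value circulates.
        raise ValueError("more than one non-default value received")
    if not proper:
        return ("next_round", est)       # only the default value was received
    v = next(iter(proper))
    if BOTTOM not in received:
        return ("decide", v)             # only v was received
    return ("next_round", v)             # v and the default value: adopt v
```

Keeping the rules free of any I/O makes the three cases of the second phase easy to test in isolation.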

4.3 The Protocol 

The protocol is formally described in Figure 2. A process p_i starts a Consensus
execution by invoking Consensus(v_i). It terminates it when it executes the
statement return, which provides it with the decided value (lines 19 and 23).

It is possible that distinct processes do not decide during the same round. To
prevent a process from blocking forever (i.e., waiting for a value from a process
that has already decided), a process that decides uses a Reliable Broadcast to
disseminate its decision value (similarly to the protocols described in [2,6,9]). To
this end, the Consensus function is made of two tasks, namely, T1 and T2. T1
implements the computation of the decision value. Line 19 and T2 implement
the reliable broadcast.

Due to space limitations, the proofs of the Validity, Agreement and Termination
properties are omitted. The interested reader will find them in [8] (they are two
pages long). The Termination proof requires f < k and f < n/2. The Agreement






Function Consensus(v_i)
cobegin
(1)  task T1: r_i ← 0; est_i ← v_i;   % v_i ≠ ⊥ %
(2)  while true do   % Sequence of asynchronous rounds %
(3)    r_i ← r_i + 1; ck_i ← coord(r_i);   % |ck_i| = k %

       % Phase 1 of round r_i (two steps): from n proposals to ≤ two values %

       % Phase 1 of round r_i. Step 1: from n proposals to 1 ≤ #proposals ≤ k %
(4)    case (i ∉ ck_i) do wait (PH1_EST(r_i, ph1_est) received from any p ∈ ck_i);
(5)                       ph1_est_i ← ph1_est
(6)         (i ∈ ck_i) do TWA_k_propose(r_i, est_i); ph1_est_i ← TWA_k_decide(r_i);
(7)                       ∀j : send PH1_EST(r_i, ph1_est_i) to p_j
(8)    endcase;
       % If k = 1 (i.e., f = 0): early stopping: execute return(ph1_est_i) %

       % Phase 1 of round r_i. Step 2: from ≤ k proposals to two values %
(9)    if (i ∉ ck_i) then ∀j : send PH1_EST(r_i, ph1_est_i) to p_j endif;
(10)   wait (PH1_EST(r_i, ph1_est) received from ⌈(n + 1)/2⌉ processes);
(11)   let ph1_rec_i = { ph1_est | PH1_EST(r_i, ph1_est) received at line 4 or 10 };
       % 1 ≤ |ph1_rec_i| ≤ min(k, ⌈(n + 1)/2⌉) %
(12)   case (|ph1_rec_i| = 1) do ph2_est_i ← v where {v} = ph1_rec_i
(13)        (|ph1_rec_i| > 1) do ph2_est_i ← ⊥ (default value)
(14)   endcase;

       % Phase 2 of round r_i (a single step):
         try to converge from ≤ two values to a single decision %
(15)   ∀j : send PH2_EST(r_i, ph2_est_i) to p_j;
(16)   wait (PH2_EST(r_i, ph2_est) received from ⌈(n + 1)/2⌉ processes);
(17)   let ph2_rec_i = { ph2_est | PH2_EST(r_i, ph2_est) received at line 16 };
       % ph2_rec_i = {⊥} or {v} or {v, ⊥} where v ≠ ⊥ %
(18)   case (ph2_rec_i = {⊥}) do skip
(19)        (ph2_rec_i = {v}) do ∀j ≠ i : send decide(v) to p_j; return(v)
(20)        (ph2_rec_i = {v, ⊥}) do est_i ← v
(21)   endcase
(22) endwhile

(23) task T2: upon reception of decide(v):
(24)   ∀j ≠ i : send decide(v) to p_j; return(v)
coend

Fig. 2. A ◇S_k-Based Consensus Protocol (f < min(k, n/2))






proof requires f < n/2. So, the protocol requires f < min(k, n/2). Note that
f < n/2 is a necessary requirement to solve consensus in an asynchronous
distributed system equipped with a failure detector that provides only eventual
accuracy [2]. The constraint f < k is due to the use of a failure detector with
limited scope accuracy.



5 Conclusion 

This paper has investigated unreliable failure detectors with limited scope
accuracy. Such a scope is defined as the number k of processes that must not
suspect a correct process. Classical failure detectors implicitly consider a scope
equal to n (the total number of processes). A reduction protocol transforming any
failure detector belonging to S_k (resp. ◇S_k) into a failure detector (without
limited scope) of the class S (resp. ◇S) has been presented. This reduction protocol
requires f < k (where f is the maximum number of process crashes). Then, the
paper has studied solutions to the consensus problem in asynchronous distributed
message-passing systems equipped with failure detectors of the class ◇S_k. A
simple ◇S_k-based consensus protocol has been presented. It has been shown that
this protocol requires f < min(k, n/2).



References 

1. Borowsky E. and Gafni E., Generalized FLP Impossibility Results for t-Resilient
Asynchronous Computations. Proc. 25th ACM Symposium on Theory of Computation,
California (USA), pp. 91-100, 1993.

2. Chandra T. and Toueg S., Unreliable Failure Detectors for Reliable Distributed
Systems. Journal of the ACM, 43(2):225-267, March 1996.

3. Chaudhuri S., Agreement is Harder than Consensus: Set Consensus Problems in
Totally Asynchronous Systems. Proc. 9th ACM Symposium on Principles of
Distributed Computing, Quebec (Canada), pp. 311-324, 1990.

4. Fischer M.J., Lynch N. and Paterson M.S., Impossibility of Distributed Consensus
with One Faulty Process. Journal of the ACM, 32(2):374-382, April 1985.

5. Guerraoui R. and Schiper A., Γ-Accurate Failure Detectors. Proc. 10th Workshop
on Distributed Algorithms (WDAG’96), Bologna (Italy), Springer Verlag LNCS
#1151, pp. 269-285, 1996.

6. Hurfin M. and Raynal M., A Simple and Fast Asynchronous Consensus Protocol
Based on a Weak Failure Detector. Distributed Computing, 12(4), 1999.

7. Mostefaoui A. and Raynal M., Solving Consensus Using Chandra-Toueg’s Unreliable
Failure Detectors: a General Quorum-Based Approach. Proc. 13th Symposium
on Distributed Computing (DISC’99), (Formerly WDAG), Bratislava (Slovakia),
Springer Verlag LNCS #1693 (P. Jayanti Ed.), September 1999.

8. Mostefaoui A. and Raynal M., Unreliable Failure Detectors with Limited Scope
Accuracy and an Application to Consensus. Tech. Report #1255, IRISA, Université
de Rennes, France, July 1999, 15 pages.

9. Schiper A., Early Consensus in an Asynchronous System with a Weak Failure
Detector. Distributed Computing, 10:149-157, 1997.



Graph Isomorphism: Its Complexity and 
Algorithms 



Seinosuke Toda 

Dept. Applied Mathematics, College of Humanities and Sciences, Nihon University, 
3-25-40 Sakurajyousui, Setagaya-ku, Tokyo 156, JAPAN 

It seems to be widely believed that the Graph Isomorphism problem in the general
case is not NP-complete. The results that the problem is low for
and for PP have supported this belief. Furthermore, it is also known that the 
problem is polynomial-time equivalent to its counting problem. This result also 
supported the belief since it is conversely believed that any NP-complete prob- 
lem would not have this property. On the other hand, it is unknown whether the 
problem can be solved in deterministic polynomial-time. From these situations, 
it is widely believed that the problem seems to have an intermediate complexity 
between deterministic polynomial-time and NP-completeness. In this talk, I will
first give a brief survey of the current status of the computational complexity of
the Graph Isomorphism problem.

When restricting the class of graphs to be dealt with, there are many 
polynomial-time algorithms on the problem. For instance, it has been shown 
that the problem can be solved in linear time for the class of trees, the class of 
interval graphs, and in polynomial time for the class of graphs with bounded
degree and for the class of partial k-trees. Related to these results, I will present
recent joint work with T. Nagoya and S. Tani showing that the problem of counting
the number of isomorphisms between two partial k-trees can be solved in time 
From this, we observe that the Graph Isomorphism itself can be also 
solved in time for partial k-trees.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 341—341, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 



Computing with Restricted Nondeterminism: 
The Dependence of the OBDD Size on the 
Number of Nondeterministic Variables 



Martin Sauerhoff*

FB Informatik, LS 2, Univ. Dortmund, 44221 Dortmund, Germany
sauerhoff@ls2.cs.uni-dortmund.de



Abstract. It is well-known that an arbitrary nondeterministic Turing 
machine can be simulated with polynomial overhead by a so-called guess- 
and-verify machine. It is an open question whether an analogous simula- 
tion exists in the context of space-bounded computation. In this paper, 
a negative answer to this question is given for nondeterministic OBDDs. 
If we require that all nondeterministic variables are tested at the top of 
the OBDD, i. e., at the beginning of the computation, this may blow-up 
the size exponentially. 

This is a consequence of the following main result of the paper. There
is a sequence of Boolean functions $f_n\colon \{0,1\}^n \to \{0,1\}$ such that $f_n$
has nondeterministic OBDDs of polynomial size with $O(n^{1/3}\log n)$
nondeterministic variables, but $f_n$ requires exponential size if only at most
$O(\log n)$ nondeterministic variables may be used.



1 Introduction and Definitions 

So far, there are only few models of computation for which it has been possible 
to analyze the power of nondeterminism and randomness. Apart from the ob- 
vious question whether or not nondeterminism or randomization helps at all to 
decrease the complexity of problems, we may also be interested in the following, 
more sophisticated questions: 

• How much nondeterminism or randomness is required to exploit the full power 
of the respective model of computation? Is there a general upper bound on 
the amount of these resources which we can make use of? 

• How does the complexity of concrete problems depend on the amount of 
available nondeterminism or randomness? 

For Turing machines, we do not even know whether nondeterminism or randomness
helps at all to solve more problems in polynomial time, and we seem to
be far away from answers to questions of the above type. On the other hand, 
nondeterminism and randomness are well-understood, e. g., in the context of 
two-party communication complexity. It is a challenging task to fill the gap in 
our knowledge between the latter model and the more “complicated” ones. 

* This work has been supported by DFG grant We 1066/8-2.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 342—355, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






In this paper, we consider the scenario of space-bounded computation of Boolean 
functions. Besides Boolean circuits and formulae, branching programs are one of 
the standard representations for Boolean functions. 

Definition 1. 

(1) A (deterministic) branching program (BP) G on the set of input variables
$X_n = \{x_1, \dots, x_n\}$ is a directed acyclic graph with one source and
two sinks, the latter labeled by the constants 0 and 1, resp. Interior nodes
are labeled by variables from $X_n$ and have two outgoing edges labeled by 0
and 1, resp. This graph represents a function $f_n\colon \{0,1\}^n \to \{0,1\}$ on $X_n$
as follows. In order to evaluate $f_n$ for a given assignment $a \in \{0,1\}^n$ of
the input variables, one follows a path starting at the source. At an interior
node labeled by $x_i$, the path continues with the edge labeled by $a_i$. The output
for $a$ is the label of the sink reached in this way.

(2) A nondeterministic branching program is syntactically a deterministic
branching program on “usual” variables from $X_n$ and some “special” variables
from $Y_n = \{y_1, \dots, y_{r(n)}\}$, called nondeterministic variables. Nodes
labeled by the latter variables are called nondeterministic nodes. On each
path from the source to one of the sinks, each nondeterministic variable is
allowed to appear at most once. The nondeterministic branching program
computes 1 on an assignment $a$ to the variables in $X_n$ iff there is an assignment
$b$ to the nondeterministic variables such that the 1-sink is reached for
the path belonging to the complete assignment consisting of $a$ and $b$. Such a
path to the 1-sink is called an accepting path.

(3) The size of a deterministic or nondeterministic branching program G, |G|,
is the number of nodes in G. The deterministic (nondeterministic) branching
program size of a function $f_n$ is the minimum size of a deterministic
(nondeterministic, resp.) branching program representing $f_n$.
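Definition 1 translates almost directly into code. The following sketch assumes a hypothetical encoding that is not part of the paper: a branching program is a dictionary from node ids to either `("sink", value)` or `("var", name, child_if_0, child_if_1)`. Nondeterministic acceptance is decided by brute force over all assignments to the nondeterministic variables, which is only practical for small examples.

```python
from itertools import product


def evaluate(bp, source, assignment):
    """Follow the path selected by `assignment` (variable name -> 0/1) to a sink."""
    node = bp[source]
    while node[0] != "sink":
        _, name, child_if_0, child_if_1 = node
        node = bp[child_if_1 if assignment[name] else child_if_0]
    return node[1]


def accepts(bp, source, a, nondet_vars):
    """Output 1 iff some assignment b to the nondeterministic variables reaches the 1-sink."""
    for bits in product((0, 1), repeat=len(nondet_vars)):
        full = dict(a)
        full.update(zip(nondet_vars, bits))
        if evaluate(bp, source, full) == 1:
            return True
    return False


def size(bp):
    """|G|: the number of nodes, sinks included."""
    return len(bp)


# Example: x1 OR x2, guessing with y1 which of the two variables to test.
OR_BP = {
    "s": ("var", "y1", "a", "b"),
    "a": ("var", "x1", "zero", "one"),
    "b": ("var", "x2", "zero", "one"),
    "zero": ("sink", 0),
    "one": ("sink", 1),
}
```

In the example, each path tests `y1` once, as required for nondeterministic variables, and the input (x1, x2) = (0, 1) is accepted through the guess `y1 = 1`.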

For a history of results on branching programs, we have to refer to the literature, 
e. g., the monograph of Wegener [15] (and the forthcoming new monograph [16]). 

It is a fundamental open problem to prove superpolynomial lower bounds on 
the size of branching programs for explicitly defined Boolean functions even in 
the deterministic case. The nondeterministic case seems to be still harder (see, 
e.g., the survey of Razborov [12]). Nevertheless, several interesting restricted 
types of branching programs could be analyzed quite successfully, and for some 
of these models even exponential lower bounds could be proven. The goal in 
complexity theory is to provide proof techniques for more and more general 
types of branching programs, and on the other hand, to extend these techniques 
also to the nondeterministic or randomized case. 

Here we deal with OBDDs (ordered binary decision diagrams) which are 
one of the restricted types of branching programs whose structure is especially 
well-understood . 

Definition 2. Let $X_n = \{x_1, \dots, x_n\}$, and let $\pi\colon \{1, \dots, n\} \to X_n$ describe a
permutation of $X_n$. A $\pi$-OBDD on $X_n$ is a branching program with the property
that for each path from the source to one of the sinks, the list of variables on this






path is a sublist of $\pi(1), \dots, \pi(n)$. We call a graph an OBDD if it is a $\pi$-OBDD
for some permutation $\pi$. The permutation $\pi$ is called the variable ordering of
the OBDD.
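The ordering condition of Definition 2 can be tested edge by edge: since every interior node has outgoing edges and the graph is acyclic, every reachable edge lies on some source-to-sink path, so it suffices that each edge between two variable nodes goes forward in the ordering. The sketch below assumes the same hypothetical dictionary encoding of nodes, `("sink", value)` or `("var", name, child_if_0, child_if_1)`, which is not taken from the paper.

```python
def is_pi_obdd(bp, source, order):
    """Check that the variables on every path from `source` form a sublist of `order`."""
    position = {name: i for i, name in enumerate(order)}
    stack, seen = [source], set()
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        node = bp[node_id]
        if node[0] == "sink":
            continue
        _, name, child_if_0, child_if_1 = node
        if name not in position:
            return False
        for child_id in (child_if_0, child_if_1):
            child = bp[child_id]
            # A variable child must come strictly later in the ordering.
            if child[0] == "var" and position.get(child[1], -1) <= position[name]:
                return False
            stack.append(child_id)
    return True


# Example: x1 AND x2, testing x1 first.
AND_BP = {
    "n1": ("var", "x1", "zero", "n2"),
    "n2": ("var", "x2", "zero", "one"),
    "zero": ("sink", 0),
    "one": ("sink", 1),
}
```

The strict inequality also rules out testing the same variable twice on a path, which a sublist of a permutation cannot contain.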

OBDDs have been invented as a data structure for the representation of Boolean 
functions and have proven to be very useful in various fields of application. For 
many applications it is crucial that one can work with OBDDs of small size for 
the functions which have to be represented. Hence, lower and upper bounds on 
the size of OBDDs are also of practical relevance. We consider the following two 
nondeterministic variants of OBDDs: 

Definition 3. 

(1) Let G be a nondeterministic branching program on variables from $X_n \cup Y_n$,
where $Y_n$ contains the nondeterministic variables, and let $\pi$ be a permutation
of the variables in $X_n$. We call G a nondeterministic $\pi$-OBDD if the order
of the $X_n$-variables on all paths from the source to one of the sinks in G is
consistent with $\pi$.

(2) If there is even a permutation $\pi'$ on all variables $X_n \cup Y_n$ such that G
is syntactically a $\pi'$-OBDD (according to Definition 2), then we call G a
synchronous nondeterministic $\pi'$-OBDD.

Synchronous nondeterministic OBDDs have the desirable property that they 
may be manipulated using the well-known efficient algorithms for deterministic 
OBDDs. On the other hand, the more general definition (1) is the natural one 
for complexity theory, since it allows OBDDs to use nondeterminism in the 
same way as done by Turing machines. (Observe that the time steps when a 
nondeterministic Turing machine may access its advice bits need not be fixed in 
advance for all computations.) 

We review some known facts on the size complexity of the different variants of 
OBDDs. There is a well-understood technique which allows a straightforward ap- 
plication of results on one-way communication complexity to prove lower bounds 
on the OBDD size (we will describe this formally in the next section) . This tech- 
nique works in the deterministic as well as the nondeterministic and randomized 
case and has yielded several exponential lower bounds on the size of OBDDs 
(see, e. g., [1,2,3,7,8,13,14]). Having understood how simple (exponential) lower 
bounds on the size of OBDDs can be proven, we can try to analyze the de- 
pendence of the size on the resources nondeterminism and randomness in more 
detail. 

The dependence of the OBDD size on the resource randomness has been dealt 
with to some extent in [13]. Analogous to a well-known result of Newman [11] for 
randomized communication complexity, it could be shown that O(log n) random
bits (where n is the input length) are always sufficient to exploit the full power 
of randomness for OBDDs. On the other hand, there is also an example of a 
function for which this amount of randomness is necessary to have randomized 
OBDDs of polynomial size (see [13] for details). 

What do we know about the resource nondeterminism? In the context of 
communication complexity theory, Hromkovic and Schnitger [6] have proven 






that, contrary to the randomized case, the number of nondeterministic advice
bits cannot be bounded by O(log n) without restricting the computational power
of the model. An asymptotically exact tradeoff between one-way communication
complexity and the number of advice bits has been proven by Hromkovic and
the author [5].

These results lead to the conjecture that there should be sequences of func- 
tions which have nondeterministic OBDDs of polynomial size, but which require 
exponential size if the amount of nondeterminism, measured in the number of 
nondeterministic variables, is limited to O(logn), where n is the input length. 
But although lower bounds on the size of nondeterministic OBDDs can imme- 
diately be obtained by using the results on one-way communication complexity, 
the upper bounds do not carry over. Proving a tradeoff between size and non- 
determinism turns out to be a difficult task even for OBDDs. 

In this paper, we present a sequence of functions $f_n\colon \{0,1\}^n \to \{0,1\}$ with
the following properties:

• $f_n$ has synchronous nondeterministic OBDDs of polynomial size which
use $r(n) \cdot (1 + o(1))$ nondeterministic variables, where
$r(n) := (1/3) \cdot (n/3)^{1/3} \log n$;

• $f_n$ requires exponential size in the synchronous model if at most $(1 - \varepsilon) \cdot r(n)$
nondeterministic variables may be used, where $\varepsilon > 0$ is an arbitrarily small
constant; and

• $f_n$ has exponential size even for general nondeterministic OBDDs if the number
of nondeterministic variables is limited to $O(\log n)$.

The rest of the paper is organized as follows. In Section 2, we present some tools 
from communication complexity theory. In Section 3, the main results are stated 
and proven. We conclude the paper with a discussion of the results and an open 
problem. 

2 Tools from Communication Complexity Theory 

In this section, we define deterministic and nondeterministic communication 
protocols and state two lemmas required for the proof of the main theorem of 
the paper. For a thorough introduction to communication complexity theory, we 
refer to the monographs of Hromkovic [4] and Kushilevitz and Nisan [10]. 

A deterministic two-party communication protocol is an algorithm by which
two players, called Alice and Bob, cooperatively evaluate a function
$f\colon X \times Y \to \{0,1\}$, where $X$ and $Y$ are finite sets. Alice obtains an input $x \in X$ and Bob
an input $y \in Y$. The players determine $f(x,y)$ by sending messages to each
other. Each player is assumed to have unlimited (but deterministic) computational
power to compute her (his) messages. The (deterministic) communication
complexity of $f$ is the minimal number of bits exchanged by a communication
protocol by which Alice and Bob compute $f(x,y)$ for each input $(x,y) \in X \times Y$.

Here we only consider protocols with one round of communication, so-called
one-way communication protocols. We use $D^{A \to B}(f)$ to denote the (deterministic)
one-way communication complexity of $f$, by which we mean the minimum
number of bits sent by Alice in a deterministic one-way protocol for $f$.

Furthermore, we also have to deal with the nondeterministic mode of com- 
putation for protocols. We directly consider the special case of nondeterministic 
one-way protocols. 

Definition 4. A nondeterministic one-way communication protocol for
$f\colon X \times Y \to \{0,1\}$ is a collection of deterministic one-way protocols $P_1, \dots, P_d$, where
$d = 2^r$, with $f(x,y) = 1$ iff there is an $i \in \{1, \dots, d\}$ such that $P_i(x,y) = 1$. The
number $r$ is called the number of advice bits of $P$.

The number of witnesses of $P$ for an input $(x,y)$, $\mathrm{acc}_P(x,y)$, is defined by
$\mathrm{acc}_P(x,y) := |\{i \mid P_i(x,y) = 1\}|$. Furthermore, let the nondeterministic
complexity of $P$ be defined by $N(P) := r + \max_{1 \le i \le d} D(P_i)$, where $D(P_i)$ denotes
the deterministic complexity of $P_i$.

The nondeterministic one-way complexity of $f$, $N^{A \to B}(f)$, is the minimum
of $N(P)$ over all protocols $P$ as described above. By the nondeterministic one-way
complexity of $f$ with restriction to $r$ advice bits and $w$ witnesses for 1-inputs,
$N^{A \to B}_{r,w}(f)$, we mean the minimum complexity of a nondeterministic protocol $P$
for $f$ which uses $r$ advice bits and which fulfills $\mathrm{acc}_P(x,y) \ge w$ for all
$(x,y) \in f^{-1}(1)$. Finally, we define $N^{A \to B}_r(f) := N^{A \to B}_{r,1}(f)$.

The following well-known function plays a central role in this paper. 

Definition 5. Let $|a|$ denote the value of an arbitrary Boolean vector $a$ as the
binary representation of a number. Define
$\mathrm{IND}_n\colon \{0,1\}^n \times \{0,1\}^{\lceil \log n \rceil} \to \{0,1\}$
on inputs $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_{\lceil \log n \rceil})$ by
$\mathrm{IND}_n(x,y) := x_{|y|+1}$, if $|y| \in \{0, \dots, n-1\}$, and $\mathrm{IND}_n(x,y) := 0$, otherwise.

This function is referred to as “index” or “pointer function” in the literature. 
We may interpret it as the description of direct storage access: The x- vector 
plays the role of the “memory contents,” and the y-vector is an “address” in the 
memory. 
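The "direct storage access" reading of the index function can be made concrete as follows. This is a small sketch under one assumption the text leaves open: the address vector y is read with its first bit as the most significant one. The helper names are hypothetical.

```python
def ceil_log2(n: int) -> int:
    """Exact value of ceil(log2(n)) for n >= 1, using integer arithmetic only."""
    return (n - 1).bit_length()


def address(y) -> int:
    """|y|: the number whose binary representation is y (first bit most significant)."""
    value = 0
    for bit in y:
        value = 2 * value + bit
    return value


def ind(x, y) -> int:
    """IND_n(x, y) = x_{|y|+1} if |y| is in {0, ..., n-1}, and 0 otherwise."""
    n = len(x)
    if len(y) != ceil_log2(n):
        raise ValueError("y must have ceil(log2(n)) bits")
    a = address(y)
    # The paper indexes x from 1, Python from 0, so x_{|y|+1} is x[a].
    return x[a] if a < n else 0
```

For n = 3 the address has two bits, so the address 3 points outside the "memory" and the function returns 0 there, whatever x is.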

Kremer, Nisan, and Ron [9] have shown that $\mathrm{IND}_n$ has complexity $\Omega(n)$ for
randomized one-way communication protocols with bounded error. It is easy to
see that essentially $\log n$ bits of communication are sufficient and also necessary
to compute $\mathrm{IND}_n$ by nondeterministic one-way protocols.

Here we will require the following more precise lower bound on the
nondeterministic one-way communication complexity of $\mathrm{IND}_n$ which also takes the
number of advice bits and the number of witnesses for 1-inputs into account:

Theorem 1. $N^{A \to B}_{r,w}(\mathrm{IND}_n) \ge nw \cdot 2^{-r-1} + r$.

This is essentially a special case of a result derived by Hromkovic and the au- 
thor [5] for a more complicated function. Because Theorem 1 is an important 
ingredient in the proof of our new result presented here, we give an outline of 
the proof. 






Sketch of Proof. Let $P$ be a nondeterministic one-way protocol for $\mathrm{IND}_n$ with $r$
advice bits and $\mathrm{acc}_P(x,y) \ge w$ for all $(x,y) \in \mathrm{IND}_n^{-1}(1)$. Hence, there are $d = 2^r$
deterministic one-way protocols $P_1, \dots, P_d$ which compute $\mathrm{IND}_n$ as described in
Definition 4. For $i = 1, \dots, d$, let $g_i$ be the function computed by $P_i$. Obviously,
it holds that $g_i \le \mathrm{IND}_n$.

By a simple counting argument (used in the same way by Hromkovic and
Schnitger in [6] and originally due to Yao [17]), we can conclude that there is an
index $i_0 \in \{1, \dots, d\}$ such that $|g_{i_0}^{-1}(1)| \ge w\,|\mathrm{IND}_n^{-1}(1)|/d$. It is easy to see that
$|\mathrm{IND}_n^{-1}(1)| = n \cdot 2^{n-1}$; hence, $|g_{i_0}^{-1}(1)| \ge nw \cdot 2^{n-r-1}$.

It now remains to prove a lower bound on the deterministic one-way
communication complexity of an arbitrary function $g$ with $g \le \mathrm{IND}_n$ in terms
of the number of 1-inputs of this function $g$. More precisely, we claim that
$D^{A \to B}(g) \ge |g^{-1}(1)|/2^n$. From this, the theorem follows.

For the proof of this lower bound on the deterministic one-way communication
complexity, we consider the communication matrix $M_g$ of $g$, which is
the $2^n \times 2^{\lceil \log n \rceil}$-matrix with 0- and 1-entries defined by $M_g(x,y) := g(x,y)$
for $x \in \{0,1\}^n$ and $y \in \{0,1\}^{\lceil \log n \rceil}$. It is a well-known fact that
$D^{A \to B}(g) = \lceil \log(\mathrm{nrows}(M_g)) \rceil$, where $\mathrm{nrows}(M_g)$ is the number of different rows of $M_g$.

Hence, it is sufficient to prove that $\log(\mathrm{nrows}(M_g)) \ge |g^{-1}(1)|/2^n$. The key
observation to prove this is that a vector $a \in \{0,1\}^n$ with $k$ ones can occur in
at most $2^{n-k}$ rows of $M_g$ since $g \le \mathrm{IND}_n$. Thus, we need many different rows
in $M_g$ if we want to have many 1-inputs for $g$. The proof can be completed by
turning the last statement into a precise numerical estimate (this part is omitted
here). □
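Two quantities used in the proof sketch can be checked by brute force for small n: the number of 1-inputs of the index function, stated to be n · 2^(n-1), and the number of different rows of its communication matrix. The sketch below is only a sanity check for the special case g = IND_n, not a proof of the general claim; it assumes that the address vector is read with its first bit as the most significant one, and all helper names are hypothetical.

```python
from itertools import product


def ceil_log2(n: int) -> int:
    """Exact ceil(log2(n)) for n >= 1."""
    return (n - 1).bit_length()


def ind(x, y) -> int:
    """Index function: x at address |y| (0-based here), or 0 if the address is out of range."""
    a = 0
    for bit in y:
        a = 2 * a + bit
    return x[a] if a < len(x) else 0


def one_inputs(n: int) -> int:
    """Number of pairs (x, y) on which the index function outputs 1."""
    width = ceil_log2(n)
    return sum(ind(x, y)
               for x in product((0, 1), repeat=n)
               for y in product((0, 1), repeat=width))


def distinct_rows(n: int) -> int:
    """nrows of the communication matrix: rows are indexed by x, columns by y."""
    ys = list(product((0, 1), repeat=ceil_log2(n)))
    rows = {tuple(ind(x, y) for y in ys) for x in product((0, 1), repeat=n)}
    return len(rows)
```

For the index function itself every x yields a different row, so the one-way complexity given by the row count is n bits, comfortably above the bound |g^{-1}(1)|/2^n = n/2 claimed for arbitrary g ≤ IND_n.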

Finally, we briefly introduce the well-known standard technique for proving
lower bounds on the OBDD size in the form of a technical lemma. In the following,
the technique is called the reduction technique for easier reference.

The reduction technique has appeared in various disguises in several papers. We 
use the formalism from [14] for our description. This makes use of the following 
standard reducibility concept from communication complexity theory. 

Definition 6 (Rectangular reduction). Let $X_f, Y_f$ and $X_g, Y_g$ be finite sets.
Let $f\colon X_f \times Y_f \to \{0,1\}$ and $g\colon X_g \times Y_g \to \{0,1\}$ be arbitrary functions. Then we
call a pair $(\varphi_1, \varphi_2)$ of functions $\varphi_1\colon X_f \to X_g$ and $\varphi_2\colon Y_f \to Y_g$ a rectangular
reduction from $f$ to $g$ if $g(\varphi_1(x), \varphi_2(y)) = f(x,y)$ for all $(x,y) \in X_f \times Y_f$. If
such a pair of functions exists for $f$ and $g$, we say that $f$ is reducible to $g$.
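For finite sets, the condition of Definition 6 can be verified exhaustively. The sketch below uses a toy example that is not from the paper: equality of two single bits is reduced to the index function on a two-cell memory, with Alice mapping her bit x to the memory (1 - x, x) and Bob using his bit as the address. All function names are hypothetical.

```python
def is_rectangular_reduction(f, g, phi1, phi2, xs, ys) -> bool:
    """Check g(phi1(x), phi2(y)) == f(x, y) for all x in xs and y in ys."""
    return all(g(phi1(x), phi2(y)) == f(x, y) for x in xs for y in ys)


def equality(x: int, y: int) -> int:
    """f: 1 iff the two bits agree."""
    return int(x == y)


def ind2(x, y) -> int:
    """g: index function on a memory x of two cells with a one-bit address y."""
    return x[y[0]]


def phi1(x: int):
    """Alice's side of the reduction: cell 0 holds NOT x, cell 1 holds x."""
    return (1 - x, x)


def phi2(y: int):
    """Bob's side of the reduction: his bit becomes the address."""
    return (y,)
```

Each map depends only on one player's input, which is what makes a protocol for g usable as a protocol for f without extra communication.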

We describe the reduction technique only in the special case which is important 
for this paper. The technique works for the general definition of nondeterministic 
OBDDs. 

Lemma 1 (Reduction Technique). Let the function $g\colon \{0,1\}^n \to \{0,1\}$ be
defined on the variable set $X = \{x_1, \ldots, x_n\}$. Let $\pi$ be a variable
ordering on $X$. W.l.o.g. (by renumbering) we may assume that $\pi$ is described
by $x_1, \ldots, x_n$.

Assume that there is a function $f\colon U \times V \to \{0,1\}$, where $U$ and $V$
are finite sets, and a parameter $p$ with $1 \le p \le n-1$ such that $f$ is
reducible to $g\colon \{0,1\}^p \times \{0,1\}^{n-p} \to \{0,1\}$. Let $G$ be a
(general) nondeterministic $\pi$-OBDD for $g$ which uses at most $r$
nondeterministic variables and has $w$ accepting paths for each 1-input of $g$.
Then it holds that

$\lceil \log |G| \rceil \ge N^{A \to B}_{r,w}(f) - r.$



3 Results 

Now we are ready to present the main result of the paper. We consider the
function defined below.

Definition 7.

(1) Let $k$ and $n$ be positive integers and $m := n + \lceil \log n \rceil$. We
define the function $\mathrm{IND}^*_n\colon \{0,1\}^{3km} \to \{0,1\}$ on the
(disjoint) variable vectors $s = (s_1, \ldots, s_{km})$, $t = (t_1, \ldots, t_{km})$,
and $v = (v_1, \ldots, v_{km})$. The vectors $s$ and $t$ are “bit masks” by which
variables from $v$ are selected. If the vectors $s$ and $t$ do not contain exactly
$n$ and $\lceil \log n \rceil$ ones, resp., then we let $\mathrm{IND}^*_n(s,t,v) := 0$.
Otherwise, let $p_1 < \cdots < p_n$ and $q_1 < \cdots < q_{\lceil \log n \rceil}$ be
the positions of ones in the $s$- and $t$-vector, resp., and define

$\mathrm{IND}^*_n(s,t,v) := \mathrm{IND}_n\bigl((v_{p_1}, \ldots, v_{p_n}), (v_{q_1}, \ldots, v_{q_{\lceil \log n \rceil}})\bigr).$

(2) Let $N = 3k^2 m$, $m = n + \lceil \log n \rceil$. We define the function
$\mathrm{MIND}_{k,n}$ (“multiple index with bit masks”) on $k$ variable vectors
$b^i = (s^i, t^i, v^i)$, where $i = 1, \ldots, k$ and
$s^i, t^i, v^i \in \{0,1\}^{km}$, by

$\mathrm{MIND}_{k,n}(b^1, \ldots, b^k) := \mathrm{IND}^*_n(s^1,t^1,v^1) \wedge \cdots \wedge \mathrm{IND}^*_n(s^k,t^k,v^k).$
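
The two definitions translate directly into code. The following is a minimal
sketch, not taken from the paper: the function names are hypothetical, bit
vectors are Python tuples, and the underlying index function is assumed to
interpret the address bits as a binary number (most significant bit first)
and to return 0 if the address is out of range.

```python
from math import ceil, log2


def ind(x, y):
    """Assumed index function: the bit of x addressed by the binary number y."""
    addr = int("".join(map(str, y)), 2)
    return x[addr] if addr < len(x) else 0


def masked_ind(n, s, t, v):
    """Masked index function: s and t select data and address bits from v."""
    addr_bits = ceil(log2(n))  # requires n >= 2
    if sum(s) != n or sum(t) != addr_bits:
        return 0
    data = [v_i for s_i, v_i in zip(s, v) if s_i]  # v at the ones of s
    addr = [v_i for t_i, v_i in zip(t, v) if t_i]  # v at the ones of t
    return ind(data, addr)


def mind(n, blocks):
    """Conjunction of the masked index function over the blocks (s, t, v)."""
    return int(all(masked_ind(n, s, t, v) for s, t, v in blocks))


# Example with n = 2, k = 2, hence m = 3 and vectors of length km = 6.
b1 = ((1, 0, 1, 0, 0, 0), (0, 0, 0, 0, 0, 1), (0, 0, 1, 0, 0, 1))
b2 = ((0, 1, 1, 0, 0, 0), (1, 0, 0, 0, 0, 0), (0, 1, 0, 0, 0, 0))
result = mind(2, [b1, b2])
```

In the example, the first block selects the data bits $(0,1)$ with address 1 and
the second block selects $(1,0)$ with address 0, so both copies evaluate to 1.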

Theorem 2.

(1) The function $\mathrm{MIND}_{k,n}$ can be represented by synchronous
nondeterministic OBDDs using $k\lceil \log n \rceil$ nondeterministic variables in
size $O(k^2 n^3 \log n)$;

(2) every synchronous nondeterministic OBDD representing the function
$\mathrm{MIND}_{k,n}$ and using at most $r$ nondeterministic variables has
size $2^{\Omega(n \cdot 2^{-r/k})}$.

Corollary 1. Let $N = 3n^2(n + \lceil \log n \rceil)$ (the input length of
$\mathrm{MIND}_{n,n}$).

(1) The function $\mathrm{MIND}_{n,n}$ can be represented by synchronous
nondeterministic OBDDs using
$n\lceil \log n \rceil = (1/3) \cdot (N/3)^{1/3} \log N \cdot (1 + o(1))$
nondeterministic variables in size $O(N^{5/3} \log N)$;

(2) every synchronous nondeterministic OBDD representing $\mathrm{MIND}_{n,n}$
with at most $O(\log N)$ nondeterministic variables requires size
$2^{\Omega(N^{1/3})}$. It still requires size $2^{\Omega(N^{\varepsilon/3})}$ if only
$(1 - \varepsilon) \cdot (1/3) \cdot (N/3)^{1/3} \log N$ nondeterministic variables
may be used, where $\varepsilon > 0$ is an arbitrarily small constant.

This follows by substituting $k = n$ in Theorem 2 and some easy calculations (we
have to omit the proof here).
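
The omitted calculation can be sketched as follows. This is a reconstruction,
not the author's proof: it assumes that the lower bound of Theorem 2(2) is used
in the form $\log(\text{size}) = \Omega(n \cdot 2^{-r/k})$, which is what the
induction in its proof yields, and all logarithms are to base 2.

```latex
\begin{align*}
N &= 3n^2\bigl(n + \lceil \log n \rceil\bigr) = 3n^3\,(1+o(1)),
  \qquad n = (N/3)^{1/3}\,(1-o(1)),
  \qquad \log n = \tfrac{1}{3}\log N \cdot (1-o(1)),\\
\text{(1)}\quad & k\lceil \log n\rceil = n\lceil\log n\rceil, \qquad
  O(k^2 n^3 \log n) = O(n^5 \log n) = O(N^{5/3}\log N),\\
\text{(2)}\quad & r = O(\log N) = O(\log n) \;\Longrightarrow\;
  n\cdot 2^{-r/n} = n\,(1-o(1)) = \Omega(N^{1/3}),\\
& r \le (1-\varepsilon)\,\tfrac{1}{3}\,(N/3)^{1/3}\log N
  = (1-\varepsilon)\,n\,(\log n + O(1))
  \;\Longrightarrow\; n\cdot 2^{-r/n} = \Omega(n^{\varepsilon})
  = \Omega(N^{\varepsilon/3}).
\end{align*}
```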



Computing with Restricted Nondeterminism 






Corollary 2. The function $\mathrm{MIND}_{n,n}$ with input length $N$ requires
exponential size in $N$ for (general) nondeterministic OBDDs with $O(\log N)$
nondeterministic variables. Furthermore, it also requires exponential size for
(general) nondeterministic OBDDs with the restriction that no nondeterministic
variable may appear after a usual variable on a path from the source to the sinks.

Sketch of Proof. The first claim follows from the fact that it is possible to move
all nondeterministic decisions to the top of a nondeterministic OBDD with only
a polynomial increase of the size if the number of nondeterministic variables is
at most $O(\log n)$.

For the second claim, let $G$ be a nondeterministic OBDD of the described
type representing $\mathrm{MIND}_{n,n}$. Then $G$ has a part consisting of
nondeterministic nodes at the top by which one of the deterministic nodes of $G$
is chosen. Let $r(n)$ be the minimal number of nondeterministic variables needed
for the nondeterministic choice of the nodes. Then $2^{r(n)-1} < |G|$. Now either
$r(n) \ge (1-\varepsilon) \cdot n \log n$ and we immediately obtain an exponential
lower bound, or $r(n) < (1-\varepsilon) \cdot n \log n$ and Theorem 2 can be
applied. □

In the remainder of the section, we prove Theorem 2. We start with the easier 
upper bound. 

Proof of Theorem 2(1) (Upper Bound). The function $\mathrm{MIND}_{k,n}$ is the
conjunction of $k$ copies of the “masked index function” $\mathrm{IND}^*_n$. The
essence of the proof is to construct sub-OBDDs for the $k$ copies of
$\mathrm{IND}^*_n$ in $\mathrm{MIND}_{k,n}$ and to combine these sub-OBDDs
afterwards by identifying the 1-sink of the $i$-th copy with the source of the
$(i+1)$-th one, for $i = 1, \ldots, k-1$. Hence, it is sufficient to describe
the construction of a nondeterministic OBDD for a single function
$\mathrm{IND}^*_n$.

We use the ordering
$y_1, \ldots, y_{\lceil \log n \rceil}, s_1, t_1, v_1, s_2, t_2, v_2, \ldots, s_{km}, t_{km}, v_{km}$
in the synchronous nondeterministic OBDD $G$ for $\mathrm{IND}^*_n$, where
$y_1, \ldots, y_{\lceil \log n \rceil}$ are the nondeterministic variables. With a
tree of nondeterministic nodes labeled by the $y$-variables at the top of $G$, a
deterministic sub-OBDD $G_d$ from $G_1, \ldots, G_n$ is chosen. The number
$d \in \{1, \ldots, n\}$ is interpreted as a “guess” of the address for the index
function $\mathrm{IND}_n$.

In $G_d$, we count the number of ones in the $s$- and $t$-vector already seen in
order to find the variables $v_{p_1}, \ldots, v_{p_n}$ and
$v_{q_1}, \ldots, v_{q_{\lceil \log n \rceil}}$ for the evaluation of
$\mathrm{IND}^*_n$ (see Def. 7). We compare the “real” address with the guessed
address $d$ and output the addressed bit in the positive case. It is easy to see
how this can be done using $O(k(n + \log n) \cdot n \log n)$ nodes in $G_d$. Thus,
the overall size of $G$ is $O(kn^3 \log n)$. The OBDD for $\mathrm{MIND}_{k,n}$
contains $k$ copies of OBDDs of this type and therefore has size
$O(k^2 n^3 \log n)$. □
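
The guess-and-verify strategy behind this construction can be simulated
directly. The sketch below is an illustration under stated assumptions and does
not build an OBDD: it only mimics one left-to-right pass over
$s_1, t_1, v_1, \ldots$ with two counters for a fixed guess $d$. The function
names are hypothetical, addresses are 0-based here (the text uses
$d \in \{1, \ldots, n\}$), and the index function is assumed to read the address
as a binary number with the most significant bit first.

```python
from itertools import product
from math import ceil, log2


def ind(x, y):
    """Assumed index function: the bit of x addressed by the binary number y."""
    addr = int("".join(map(str, y)), 2)
    return x[addr] if addr < len(x) else 0


def masked_ind(n, s, t, v):
    """Direct evaluation of the masked index function (see Def. 7)."""
    addr_bits = ceil(log2(n))
    if sum(s) != n or sum(t) != addr_bits:
        return 0
    data = [v_i for s_i, v_i in zip(s, v) if s_i]
    addr = [v_i for t_i, v_i in zip(t, v) if t_i]
    return ind(data, addr)


def accepts_with_guess(n, d, s, t, v):
    """One pass over (s_i, t_i, v_i) for the guessed 0-based address d."""
    addr_bits = ceil(log2(n))
    guess = [(d >> (addr_bits - 1 - j)) & 1 for j in range(addr_bits)]
    ones_s = ones_t = 0    # number of ones seen so far in s and in t
    addressed_bit = None   # data bit selected by the guess, once seen
    for s_i, t_i, v_i in zip(s, t, v):
        if s_i:
            if ones_s == d:
                addressed_bit = v_i
            ones_s += 1
        if t_i:
            if ones_t >= addr_bits or v_i != guess[ones_t]:
                return False  # too many address bits or wrong guess
            ones_t += 1
    return ones_s == n and ones_t == addr_bits and addressed_bit == 1


def nondeterministic_masked_ind(n, s, t, v):
    """Accept iff some guess d leads to an accepting computation."""
    return int(any(accepts_with_guess(n, d, s, t, v) for d in range(n)))


# Exhaustive comparison with the direct definition for n = 2, length 3.
vectors = list(product((0, 1), repeat=3))
agree = all(nondeterministic_masked_ind(2, s, t, v) == masked_ind(2, s, t, v)
            for s in vectors for t in vectors for v in vectors)
```

For every input at most one guess can match the real address, so each accepted
input has exactly one accepting guess in this simulation.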

We now turn to the proof of the lower bound. As a first idea, we could try to
apply the reduction technique (Lemma 1). The following function appears to be
a suitable candidate for a rectangular reduction to $\mathrm{MIND}_{k,n}$.

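Definition 8. Define
$\mathrm{IND}_{k,n}\colon \{0,1\}^{kn} \times \{0,1\}^{k\lceil \log n \rceil} \to \{0,1\}$
on inputs $x^i = (x^i_1, \ldots, x^i_n)$ and
$y^i = (y^i_1, \ldots, y^i_{\lceil \log n \rceil})$, where $i = 1, \ldots, k$, by

$\mathrm{IND}_{k,n}\bigl((x^1, \ldots, x^k), (y^1, \ldots, y^k)\bigr) := \mathrm{IND}_n(x^1, y^1) \wedge \cdots \wedge \mathrm{IND}_n(x^k, y^k).$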






We would like to consider nondeterministic one-way protocols for
$\mathrm{IND}_{k,n}$ according to the partition of the variables where Alice has
$(x^1, \ldots, x^k)$ and Bob $(y^1, \ldots, y^k)$, i.e., where the variables of all
copies of $\mathrm{IND}_n$ are “split” between the players. It has been shown
in [5] that in this case, the players essentially cannot do better than evaluate
all $k$ copies of $\mathrm{IND}_n$ independently, which requires
$k\lceil \log n \rceil$ nondeterministic advice bits.

Unfortunately, we cannot assume that the variable ordering of the given
nondeterministic OBDD for $\mathrm{MIND}_{k,n}$ allows a reduction according to
this partition of the variables. It turns out that the described idea does not
work for variable orderings of the following type.

Definition 9. Let $f\colon \{0,1\}^{kn} \to \{0,1\}$ be defined on variable vectors
$x^i = (x^i_1, \ldots, x^i_n)$, $i = 1, \ldots, k$. An ordering $\pi$ of the
variables of $f$ is called blockwise with respect to $x^1, \ldots, x^k$ if there
are permutations $(b_1, \ldots, b_k)$ of $\{1, \ldots, k\}$ and
$(j_{i,1}, \ldots, j_{i,n})$ of $\{1, \ldots, n\}$ for $i = 1, \ldots, k$ such that
$\pi$ is the ordering

$x^{b_1}_{j_{1,1}}, \ldots, x^{b_1}_{j_{1,n}},\; x^{b_2}_{j_{2,1}}, \ldots, x^{b_2}_{j_{2,n}},\; \ldots,\; x^{b_k}_{j_{k,1}}, \ldots, x^{b_k}_{j_{k,n}}.$

The $x^i$ are called blocks in this context. For the ease of notation, we may
assume that the blocks are simply ordered according to $x^1, \ldots, x^k$ in $\pi$
and that the variables within each block are ordered as in the definition of $x^i$.
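
Whether a given ordering is blockwise can be tested mechanically. The following
sketch is an illustration only; the encoding of the variable $x^i_j$ as the pair
`(i, j)` and the function name are assumptions made for this example.

```python
def is_blockwise(ordering, k, n):
    """Return True iff the ordering lists the k blocks one after the other.

    ordering is a list of pairs (i, j) standing for the variable x^i_j,
    with 1 <= i <= k and 1 <= j <= n.
    """
    all_vars = {(i, j) for i in range(1, k + 1) for j in range(1, n + 1)}
    if len(ordering) != k * n or set(ordering) != all_vars:
        return False  # not an ordering of exactly the kn variables
    # Cut the ordering into k consecutive chunks of n variables each; it is
    # blockwise iff every chunk contains variables of a single block.
    chunks = [ordering[c * n:(c + 1) * n] for c in range(k)]
    return all(len({i for i, _ in chunk}) == 1 for chunk in chunks)


blockwise = is_blockwise([(2, 2), (2, 1), (1, 1), (1, 2)], k=2, n=2)
interleaved = is_blockwise([(1, 1), (2, 1), (1, 2), (2, 2)], k=2, n=2)
```

The first ordering tests block 2 completely before block 1 and is blockwise; the
second alternates between the blocks and is not.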

For $\mathrm{MIND}_{k,n}$ we consider orderings which are blockwise with respect to
$(s^i, t^i, v^i)$ if we ignore the nondeterministic variables. Such an ordering is
used in the proof of the upper bound for $\mathrm{MIND}_{k,n}$. One can verify that
our first idea to reduce $\mathrm{IND}_{k,n}$ to $\mathrm{MIND}_{k,n}$ no longer
works in the case of blockwise orderings of this type.

Nevertheless, we claim that a synchronous nondeterministic OBDD for
$\mathrm{MIND}_{k,n}$ will become large if there are too few nondeterministic
variables, even if we choose a blockwise ordering. First, we show that it is
sufficient to consider blockwise variable orderings, which seem to constitute the
“hardest case” for the proof of the lower bound. By the definition of
$\mathrm{MIND}_{k,n}$, we have ensured that we can select a blockwise subordering
of an arbitrary variable ordering by fixing the bit mask vectors in an appropriate
way.



Lemma 2. Let $G$ be a synchronous nondeterministic OBDD for $\mathrm{MIND}_{k,n}$.
Let $\pi$ be the subordering of the usual variables in $G$. Then there are
assignments to the $s$- and $t$-variables of $\mathrm{MIND}_{k,n}$ such that by
applying these assignments to $G$ one obtains a synchronous nondeterministic OBDD
$G'$ for the function $\mathrm{IND}_{k,n}$ which is no larger than $G$, uses at most
as many nondeterministic variables as $G$, and where the usual variables are
ordered according to $\pi_b$ described by

$x^1_1, \ldots, x^1_n, y^1_1, \ldots, y^1_{\lceil \log n \rceil},\; \ldots,\; x^k_1, \ldots, x^k_n, y^k_1, \ldots, y^k_{\lceil \log n \rceil}$

after renaming the selected $v$-variables.



Proof. Let $L$ be the list of the $v$-variables of $\mathrm{MIND}_{k,n}$ ordered
according to $\pi$. Only by deleting variables, we obtain a sublist of $L$ where
the variables appear in a blockwise ordering with respect to the $v^i$ as blocks.
This is done in steps $t = 1, \ldots, k$. Let $L_t$ be the list of variables we are
still working with at the beginning of step $t$, and let $m_t$ be the minimum
number of variables in all blocks which have not been completely removed in the
list $L_t$. We start with $L_1 = L$ and $m_1 = km$.

In step $t$, we define $p$ as the smallest index in $L_t$ such that the sublist of
elements with indices $1, \ldots, p$ contains exactly $m = n + \lceil \log n \rceil$
variables of a block $v^i$. Define $b_t := i$ and choose indices
$j_{t,1}, \ldots, j_{t,m}$ such that the variables
$v^i_{j_{t,1}}, \ldots, v^i_{j_{t,m}}$ are among the first $p$ variables of $L_t$.
Afterwards, delete the first $p$ variables and all variables of the block $v^i$
from $L_t$ to obtain $L_{t+1}$.

It is easy to verify that $m_t \ge (k - t + 1)m$ for $t = 1, \ldots, k$, and hence,
the above algorithm can really be carried out sufficiently often. Let

$v^{b_1}_{j_{1,1}}, \ldots, v^{b_1}_{j_{1,m}},\; v^{b_2}_{j_{2,1}}, \ldots, v^{b_2}_{j_{2,m}},\; \ldots,\; v^{b_k}_{j_{k,1}}, \ldots, v^{b_k}_{j_{k,m}}$

be the obtained sublist of variables. For $i = 1, \ldots, k$, fix $s^{b_i}$ such that
it contains ones exactly at the positions $j_{i,1}, \ldots, j_{i,n}$ and fix
$t^{b_i}$ such that it contains ones exactly at the positions
$j_{i,n+1}, \ldots, j_{i,n+\lceil \log n \rceil}$. It is a simple observation that, in
general, assigning constants to variables may only reduce the OBDD size and
the number of variables, and the OBDD obtained by the assignment
(nondeterministically) represents the restricted function. □
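
The selection procedure from this proof can be written as a short greedy
algorithm. The sketch below is an illustration only: the encoding of the
variable $v^i_j$ as the pair `(i, j)`, the function name, and the random test
ordering are assumptions made for this example.

```python
import random
from math import ceil, log2


def select_blockwise_sublist(order, k, m):
    """Greedy selection of a blockwise sublist.

    order lists pairs (i, j) for the variables v^i_j, sorted according to pi.
    Returns one pair (block, indices) per step t = 1, ..., k.
    """
    remaining = list(order)
    selected = []
    for _ in range(k):
        counts = {}
        for p, (block, _index) in enumerate(remaining):
            counts[block] = counts.get(block, 0) + 1
            if counts[block] == m:
                break  # shortest prefix containing m variables of one block
        else:
            raise ValueError("no block with m remaining variables")
        indices = [j for i, j in remaining[:p + 1] if i == block]
        selected.append((block, indices))
        # Delete the prefix and all remaining variables of the chosen block.
        remaining = [(i, j) for i, j in remaining[p + 1:] if i != block]
    return selected


k, n = 3, 2
m = n + ceil(log2(n))  # m = 3, so each block has km = 9 variables
order = [(i, j) for i in range(1, k + 1) for j in range(1, k * m + 1)]
random.Random(0).shuffle(order)  # an arbitrary ordering pi
selected = select_blockwise_sublist(order, k, m)
```

Every other block loses at most $m - 1$ variables in a step, which is the reason
behind the bound $m_t \ge (k - t + 1)m$ and why the `ValueError` branch is never
reached for complete inputs.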



Hence, we are left with the task to prove a lower bound on the size of
synchronous nondeterministic OBDDs with “few” nondeterministic variables for the
function $\mathrm{IND}_{k,n}$ and the blockwise variable ordering $\pi_b$ on the
usual variables. Essentially, our plan is again to decompose the whole OBDD into
sub-OBDDs which are responsible for the evaluation of the single copies of
$\mathrm{IND}_n$. The following central lemma will be used for the decomposition.
It is crucial for the proof of this lemma that we consider synchronous
nondeterministic OBDDs.

Lemma 3 (The Nondeterministic Partition Lemma). Let
$f_n\colon \{0,1\}^n \to \{0,1\}$ be an arbitrary function, and let
$f_{k,n}\colon \{0,1\}^{kn} \to \{0,1\}$ be defined on variable vectors
$x^1, \ldots, x^k$ by
$f_{k,n}(x^1, \ldots, x^k) := f_n(x^1) \wedge \cdots \wedge f_n(x^k)$. Let $\pi_b$ be
a blockwise variable ordering with respect to $x^1, \ldots, x^k$. Let $G$ be a
synchronous nondeterministic OBDD for $f_{k,n}$ where the usual variables are
ordered according to $\pi_b$ and which uses $r$ nondeterministic variables.

Then there are synchronous nondeterministic OBDDs $G_1$ and $G_2$ with the
ordering $\pi_b$ on the usual variables and numbers $r_1 \in \{0, \ldots, r\}$ and
$w \in \{1, \ldots, 2^{r_1}\}$ such that:

(1) $|G_1| \le |G|$ and $|G_2| \le |G|$;

(2) $G_1$ represents $f_n$, uses at most $r_1$ nondeterministic variables, and
there are at least $w$ accepting paths in $G_1$ for each 1-input of $f_n$;

(3) $G_2$ represents $f_{k-1,n}$ and uses at most $r - r_1 + \lceil \log w \rceil$
nondeterministic variables.



Proof. Let $G$ be as described in the lemma. We assume that the ordering $\pi_b$ of
the usual variables is given by $x^1_1, \ldots, x^1_n, \ldots, x^k_1, \ldots, x^k_n$.
Let $r_1$ be the number of nondeterministic variables tested before $x^1_n$ (thus,
$r - r_1$ nondeterministic variables are tested after $x^1_n$). Let $y^1$ and $y^2$
be vectors with the nondeterministic variables tested before and after $x^1_n$,
resp.

For the construction of $G_1$, we consider the set of nodes in $G$ reachable by
assignments to $x^1$ and $y^1$. We replace such a node by the 0-sink, if the 1-sink
of $G$ is not reachable from it by assignments to $x^2, \ldots, x^k$ and $y^2$, and
by the 1-sink, otherwise. The resulting graph is called $G_1$. It can easily be
verified that it represents $f_n$. We define $w$ as the minimum of the number of
accepting paths in $G_1$ for a 1-input of $f_n$. Thus $G_1$ fulfills the
requirements of the lemma.

The synchronous nondeterministic OBDD $G_2$ is constructed as follows.
Choose an assignment $a \in \{0,1\}^n$ to $x^1$ such that $G_1$ has exactly $w$
accepting paths for $a$. Let $G_a$ be the nondeterministic OBDD on
$y^1, x^2, \ldots, x^k$ and $y^2$ obtained from $G$ by fixing the $x^1$-variables
to $a$. The top of this graph consists of nondeterministic nodes labeled by
$y^1$-variables. Call the nodes reached by assignments to $y^1$ “cut nodes.”
W.l.o.g., we may assume that none of the cut nodes represents the 0-function.
(Otherwise, we remove the node, as well as all nodes used to reach it and the
nodes only reachable from it. This does not change the represented function.)

By the choice of $a$ and the above assumption, there are at most $w$ paths
belonging to assignments to $y^1$ by which cut nodes are reached, hence, the number
of cut nodes is also bounded by $w$. Now we rearrange the top of the graph $G_a$
consisting of the nodes labeled by $y^1$-variables such that only the minimal
number of nondeterministic variables is used. Obviously, $\lceil \log w \rceil$
nondeterministic variables are sufficient for this. Call the resulting graph
$G_2$. This is a synchronous nondeterministic OBDD which obviously represents
$f_{k-1,n}$ and uses at most $r - r_1 + \lceil \log w \rceil$ nondeterministic
variables. □

We use this lemma to complete the proof of the main theorem. 

Proof of Theorem 2(2) (Lower Bound). Let $G$ be a synchronous nondeterministic
OBDD for $\mathrm{MIND}_{k,n}$ with $r$ nondeterministic variables. We apply Lemma 2
to obtain a synchronous nondeterministic OBDD $G'$ for $\mathrm{IND}_{k,n}$ with
$|G'| \le |G|$, at most $r$ nondeterministic variables and the blockwise ordering
$\pi_b$ on the usual variables. The variable blocks are $(x^i, y^i)$, where
$x^i = (x^i_1, \ldots, x^i_n)$ and $y^i = (y^i_1, \ldots, y^i_{\lceil \log n \rceil})$.

Define $s_{k,r}(n)$ as the minimal size of a synchronous nondeterministic OBDD
for $\mathrm{IND}_{k,n}$ with at most $r$ nondeterministic variables and the
ordering $\pi_b$ for the usual variables. We claim that

$\lceil \log s_{k,r}(n) \rceil \ge 2^{1/k-2} \cdot n \cdot 2^{-r/k}.$

From this, we obtain the lower bound claimed in the theorem. We prove the above
inequality by induction on $k$, using the Partition Lemma for the induction step.
The required lower bounds on the size of sub-OBDDs will be derived by the
reduction technique.






Case $k = 1$: By Theorem 1, $N^{A \to B}_{r,1}(\mathrm{IND}_n) \ge n \cdot 2^{-r-1} + r$.
By a standard application of the reduction technique, we immediately get
$\lceil \log s_{1,r}(n) \rceil \ge 2^{-1} \cdot n \cdot 2^{-r}$.

Case $k > 1$: We assume that the claim has been shown for $s_{k-1,r'}$, for
arbitrary $r'$. Let $G'$ be a synchronous nondeterministic OBDD for
$\mathrm{IND}_{k,n}$ with $r$ nondeterministic variables and ordering $\pi_b$ on the
usual variables.

We first apply the Partition Lemma to obtain synchronous nondeterministic
OBDDs $G'_1$ and $G'_2$ with their usual variables ordered according to $\pi_b$ and
numbers $r_1$ and $w$ with the following properties:

• $G'_1$ represents $\mathrm{IND}_n$, uses at most $r_1$ nondeterministic variables,
and there are at least $w$ accepting paths for each 1-input of $\mathrm{IND}_n$;

• $G'_2$ represents $\mathrm{IND}_{k-1,n}$ and uses at most
$r - r_1 + \lceil \log w \rceil$ nondeterministic variables.

Furthermore, $|G'_1| \le |G'|$ and $|G'_2| \le |G'|$. By Theorem 1,
$N^{A \to B}_{r_1,w}(\mathrm{IND}_n) \ge nw \cdot 2^{-r_1-1} + r_1$. Applying the
reduction technique, we get a lower bound on $|G'_1|$. Together with the induction
hypothesis we have

$\lceil \log |G'_1| \rceil \ge nw \cdot 2^{-r_1-1}$ and

$\lceil \log |G'_2| \rceil \ge 2^{1/(k-1)-2} \cdot n \cdot 2^{-(r - r_1 + \lceil \log w \rceil)/(k-1)}.$

It follows that

$\lceil \log s_{k,r}(n) \rceil \ge \max\{\, nw \cdot 2^{-r_1-1},\; 2^{1/(k-1)-2} \cdot n \cdot 2^{-(r - r_1 + \log w + 1)/(k-1)} \,\},$

where we have removed the ceiling using $\lceil \log w \rceil \le \log w + 1$. The
two functions within the maximum expression are monotone increasing and
decreasing in $w$, resp. Thus, the minimum with respect to $w$ is attained if

$nw \cdot 2^{-r_1-1} = 2^{1/(k-1)-2} \cdot n \cdot 2^{-(r - r_1 + \log w + 1)/(k-1)}.$

Solving for $w$, we obtain $w = 2^{1/k-2} \cdot 2^{-r/k} \cdot 2^{r_1+1}$. By
substituting this into the above estimate for $\lceil \log s_{k,r}(n) \rceil$, we
obtain the desired result,

$\lceil \log s_{k,r}(n) \rceil \ge 2^{1/k-2} \cdot n \cdot 2^{-r/k}.$



□ 
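
The step "solving for $w$" in the induction can be spelled out as follows. This is
a worked version of the calculation, stated for the balance condition between
the bound $nw \cdot 2^{-r_1-1}$ for the first sub-OBDD and the bound obtained from
the induction hypothesis for the second one; logarithms are to base 2.

```latex
\begin{align*}
nw \cdot 2^{-r_1-1}
  &= 2^{1/(k-1)-2} \cdot n \cdot 2^{-(r - r_1 + \log w + 1)/(k-1)} \\
\log w - r_1 - 1
  &= \tfrac{1}{k-1} - 2 - \tfrac{r - r_1 + \log w + 1}{k-1} \\
(k-1)\log w - (k-1)(r_1+1)
  &= 2 - 2k - r + r_1 - \log w \\
k \log w &= k r_1 - k + 1 - r \\
\log w &= \bigl(\tfrac{1}{k} - 2\bigr) - \tfrac{r}{k} + (r_1 + 1) \\
nw \cdot 2^{-r_1-1} &= 2^{1/k-2} \cdot n \cdot 2^{-r/k}
\end{align*}
```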



Discussion of the Results and an Open Problem 



We have shown that the number of nondeterministic variables in nondeterministic
OBDDs cannot be bounded by $O(\log n)$ without restricting the power of
the model. Requiring that the nondeterministic variables are tested at the top 
of the graph may cause an exponential blow-up of the OBDD size. As a by- 
product, these results have also led to a deeper understanding of the structure 
of nondeterministic OBDDs. 






A natural question which could not be investigated here due to the lack of 
space is how the synchronous and the general model of nondeterministic OBDDs 
are related to each other. It is easy to see that general nondeterministic OBDDs 
can be made synchronous (while maintaining polynomial size) by spending ad- 
ditional nondeterministic variables. Using the ideas in this paper, one can also 
show that there are functions for which this increase of the number of variables 
is indeed unavoidable. 

The next step can be to analyze the influence of the resource nondeterminism 
for more general types of branching programs in greater detail. It is already a 
challenging task to try to prove a tradeoff between the size of nondeterministic 
read-once branching programs and the number of nondeterministic variables. 
In a deterministic read-once branching program, each variable may appear at 
most once on each path from the source to one of the sinks. A nondeterministic 
read-once branching program fulfills this restriction for the usual and for the 
nondeterministic variables. 

Open Problem. Find a sequence of functions $f_n\colon \{0,1\}^n \to \{0,1\}$ such
that $f_n$ has polynomial size for unrestricted nondeterministic read-once BPs, but
requires exponential size if only $O(\log n)$ nondeterministic variables may be
used.

One may again try the conjunction of several copies of a function which is easy for 
nondeterministic read-once branching programs, but a new approach is required 
for the proof of the respective lower bound. 

Acknowledgement 

Thanks to Ingo Wegener for proofreading and improving the upper bound of 
Theorem 2, to Juraj Hromkovic, Detlef Sieling, and Ingo Wegener for helpful 
discussions, and finally to the referees for careful reading and useful hints. 



References 

1. F. Ablayev. Randomization and nondeterminism are incomparable for polynomial
ordered binary decision diagrams. In Proc. of the 24th Int. Coll. on Automata,
Languages, and Programming (ICALP), LNCS 1256, 195-202. Springer, 1997. 344

2. R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE 
Trans. Computers, C-35(8):677-691, Aug. 1986. 344 

3. K. Hosaka, Y. Takenaga, and S. Yajima. On the size of ordered binary decision 
diagrams representing threshold functions. In Proc. of the 5th Int. Symp. on Al- 
gorithms and Computation (ISAAC), LNCS 834, 584 - 592. Springer, 1994. 344 

4. J. Hromkovic. Communication Complexity and Parallel Computing. EATCS Texts 
in Theoretical Computer Science. Springer, Berlin, 1997. 345 

5. J. Hromkovic and M. Sauerhoff. Communication with restricted nondeterminism
and applications to branching program complexity. Manuscript, 1999. 345, 346,
350



6. J. Hromkovic and G. Schnitger. Nondeterministic communication with a limited 
number of advice bits. In Proc. of the 28th Ann. ACM Symp. on Theory of Com- 
puting (STOC), 551 - 560, 1996. 344, 347 

7. S. P. Jukna. Entropy of contact circuits and lower bounds on their complexity. 
Theoretical Computer Science, 57:113 - 129, 1988. 344 

8. M. Krause. Lower bounds for depth-restricted branching programs. Information 
and Computation, 91(1):1-14, Mar. 1991. 344 

9. I. Kremer, N. Nisan, and D. Ron. On randomized one-round communication com- 
plexity. In Proc. of the 27th Ann. ACM Symp. on Theory of Computing (STOC), 
596 - 605, 1995. 346 

10. E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University 
Press, Cambridge, 1997. 345 

11. I. Newman. Private vs. common random bits in communication complexity. In- 
formation Processing Letters, 39:67 - 71, 1991. 344 

12. A. A. Razborov. Lower bounds for deterministic and nondeterministic branching 
programs. In Proc. of Fundamentals of Computation Theory (FCT), LNCS 529, 
47-60. Springer, 1991. 343 

13. M. Sauerhoff. Complexity Theoretical Results for Randomized Branching Programs. 
PhD thesis, Univ. of Dortmund. Shaker, 1999. 344, 344, 344 

14. M. Sauerhoff. On the size of randomized OBDDs and read-once branching
programs for k-stable functions. In Proc. of the 16th Ann. Symp. on Theoretical
Aspects of Computer Science (STACS), LNCS 1563, 488-499. Springer, 1999. 344,
347

15. I. Wegener. The Complexity of Boolean Functions. Wiley-Teubner, 1987. 343

16. I. Wegener. Branching Programs and Binary Decision Diagrams — Theory and 
Applications. Monographs on Discrete and Applied Mathematics. SIAM, 1999. To 
appear. 343 

17. A. C. Yao. Lower bounds by probabilistic arguments. In Proc. of the 24th IEEE
Symp. on Foundations of Computer Science (FOCS), 420 - 428, 1983. 347



Lower Bounds for Linear Transformed 
OBDDs and FBDDs 

(Extended Abstract) 



Detlef Sieling* 

FB Informatik, LS 2, Univ. Dortmund, 
44221 Dortmund, Germany 
sieling@ls2.cs.uni-dortmund.de



Abstract. Linear Transformed OBDDs (LTOBDDs) have been suggested
as a generalization of OBDDs for the representation and manipulation
of Boolean functions. Instead of variables, as in the case of OBDDs,
parities of variables may be tested at the nodes of an LTOBDD. By this
extension it is possible to represent functions in polynomial size that do 
not have polynomial size OBDDs, e.g., the characteristic functions of lin- 
ear codes. In this paper lower bound methods for LTOBDDs and some 
generalizations of LTOBDDs are presented and applied to explicitly de- 
fined functions. By the lower bound results it is possible to compare the 
set of functions with polynomial size LTOBDDs and their generalizations 
with the set of functions with polynomial size representations for many 
other restrictions of BDDs. 



1 Introduction 

Branching Programs or Binary Decision Diagrams (BDDs) are a representation 
of Boolean functions with applications in complexity theory and in programs for 
hardware design and verification as well. In complexity theory branching pro- 
grams are considered as a model of sequential computation. The goal is to prove 
upper and lower bounds on the branching program size for particular Boolean 
functions in order to obtain upper and lower bounds on the sequential space
complexity of these functions. Since for unrestricted branching programs no method
to obtain exponential lower bounds is known, many restricted variants of
branching programs have been considered; for an overview see, e.g., Razborov [16].

In hardware design and verification data structures for the representation and 
manipulation of Boolean functions are needed. The most popular data structures
are Ordered Binary Decision Diagrams (OBDDs), which were introduced by
Bryant [3]. They allow the compact representation and the efficient manipulation
of many important functions. However, there are a lot of other important
functions for which OBDDs are much too large. For this reason many generaliza- 
tions of OBDDs have been proposed as a data structure for Boolean functions. 

* The author was supported in part by DFG grant We 1066/8. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 356—368, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






Most of these generalizations are restricted branching programs that are also in- 
vestigated in complexity theory. Hence, the lower and upper bound results and 
methods from complexity theory are also useful in order to compare the classes 
of functions for which the different extensions of OBDDs have polynomial size. 

In this paper we consider several variants of branching programs that are 
obtained by introducing linear transformations in OBDDs or generalizations 
of OBDDs. In order to explain the differences between ordinary OBDDs and 
OBDDs with linear transformations we first repeat the definition of BDDs or 
branching programs, respectively, and of OBDDs. A Binary Decision Diagram 
(BDD) or Branching Program for a function $f(x_1, \ldots, x_n)$ is a directed acyclic
graph with one source node and two sinks. The sinks are labeled by the Boolean
constants 0 and 1. Each internal node is labeled by a variable $x_i$ and has an
outgoing 0-edge and an outgoing 1-edge. For each input $a = (a_1, \ldots, a_n)$ there
is a computation path from the source to a sink. The computation path starts
at the source and at each internal node labeled by $x_i$ the next edge of the
computation path is the outgoing $a_i$-edge. The label of the sink reached by the
computation path for $a$ is equal to $f(a)$. The size of a branching program or
BDD is the number of internal nodes. In OBDDs the variables have to be tested 
on each computation path at most once and according to a fixed ordering. 

In the following we call an expression $x_{i(1)} \oplus \cdots \oplus x_{i(k)}$ a
linear test. A generalized variable ordering over $x_1, \ldots, x_n$ is a sequence of
$n$ linearly independent linear tests. In Linear Transformed OBDDs (LTOBDDs) the
internal nodes may be labeled by linear tests instead of single variables as in
the case of OBDDs. However, on each computation path the tests have to be arranged
according to a fixed generalized variable ordering. The function $f$ represented by
an LTOBDD is evaluated in the obvious way: The computation path for some input
$a = (a_1, \ldots, a_n)$ starts at the source. At an internal node labeled by
$x_{i(1)} \oplus \cdots \oplus x_{i(k)}$ the outgoing edge labeled by
$a_{i(1)} \oplus \cdots \oplus a_{i(k)}$ has to be chosen. The label of the
sink at the end of the computation path is equal to $f(a)$.
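
The evaluation rule can be sketched in a few lines. The representation below is
an assumption made for this illustration and is not taken from the paper: a node
is a triple `(test, low, high)`, where `test` is the tuple of (0-based) variable
indices whose parity is computed, and the sinks are the integers 0 and 1.

```python
from itertools import product


def evaluate_ltobdd(nodes, source, a):
    """Follow the computation path for the input a and return the sink label."""
    current = source
    while current not in (0, 1):
        test, low, high = nodes[current]
        parity = sum(a[i] for i in test) % 2  # value of the linear test on a
        current = high if parity else low
    return current


# Hypothetical LTOBDD with two internal nodes: first test x0 XOR x1, then x2.
nodes = {
    "u": ((0, 1), 0, "v"),
    "v": ((2,), 0, 1),
}

table = {a: evaluate_ltobdd(nodes, "u", a) for a in product((0, 1), repeat=3)}
```

The example represents the function $(x_0 \oplus x_1) \wedge x_2$ with two
internal nodes; an ordinary OBDD node labeled by a single variable is the
special case of a test with one index.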

In Section 2 we present an alternate definition of LTOBDDs. There we also 
define several extensions of LTOBDDs. An example of an LTOBDD is shown in 
the left of Figure 1. We remark that the linear independence of the linear tests 
of a generalized variable ordering is necessary, since otherwise not all inputs can 
be distinguished by the LTOBDD so that not all functions can be represented. 

The evaluation of linear tests instead of single variables at the nodes of BDDs 
was already suggested by Aborhey [1] who, however, only considers decision 
trees. LTOBDDs have been suggested as a generalization of OBDDs (Meinel, 
Somenzi and Theobald [14]), since they are a more compact representation of 
Boolean functions than OBDDs. The results of Bern, Meinel and Slobodova [2] 
on Transformed BDDs imply that the algorithms for the manipulation of OBDDs 
can also be applied to LTOBDDs so that existing OBDD packages can easily 
be extended to LTOBDDs. Gunther and Drechsler [7] present an algorithm for 
computing optimal generalized variable orderings. 

Fig. 1. An example of an LTOBDD with the generalized variable ordering
$x_1 \oplus x_2 \oplus x_3$, [...] $\oplus\, x_2$, $x_4$, $x_1 \oplus x_3$ for some
function $f$ and of an OBDD for some function $g$ so that $f(x) = g(A \cdot x)$.

An example that shows the power of LTOBDDs is given by the characteristic
functions of linear codes. It is easy to see that all characteristic functions of
linear codes can be represented by LTOBDDs of linear size: In order to check
whether a word $x$ belongs to a linear code it suffices to test whether the inner
product of $x$ and each row of the parity check matrix of the code is equal to 0.
For each row we can choose a linear test that is equal to the inner product of the
row and the input. Since the rows of the parity check matrix are linearly
independent, we can choose these linear tests as a generalized variable ordering
of an LTOBDD, and an LTOBDD computing the NOR of these linear tests also computes
the characteristic function of the code. On the other hand, exponential lower
bounds on the size of many restrictions of branching programs are known for the
characteristic functions of certain linear codes: Exponential lower bounds for
syntactic read-k-times branching programs are proved by Okol’nishnikova [15], for
nondeterministic syntactic read-k-times branching programs by Jukna [9], for
semantic (1, +k)-branching programs by Jukna and Razborov [11], and for ⊕OBDDs by
Jukna [10].
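
The observation that a code's characteristic function is the NOR of the parity
checks can be illustrated as follows. This is a minimal sketch with hypothetical
names; the parity check matrix of the length-3 repetition code is chosen here
only as a small example.

```python
from itertools import product


def parity_check(row, x):
    """Inner product over F_2 of a parity check row and the word x."""
    return sum(r & b for r, b in zip(row, x)) % 2


def is_codeword(H, x):
    """Characteristic function of the code: the NOR of all linear tests."""
    return int(not any(parity_check(row, x) for row in H))


# Parity check matrix of the repetition code {000, 111}: x0 = x1 and x1 = x2.
H = [(1, 1, 0), (0, 1, 1)]
codewords = [x for x in product((0, 1), repeat=3) if is_codeword(H, x)]
```

An LTOBDD can evaluate these tests one after the other and move to the 0-sink as
soon as one test yields 1, so one internal node per row of the matrix suffices.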

The aim of this paper is to present methods to prove exponential lower
bounds on the size of LTOBDDs and some generalizations of LTOBDDs. Many
lower bounds for restricted BDDs have been proved by arguments based on
communication complexity theory. Roughly, the BDD is cut into two parts so that
some part of the input is known only in the first part of the BDD and the other
part of the input only in the second part of the BDD. If the computation of
the considered function requires the exchange of a large amount of information
between those two parts of the input, the cut through the BDD and, therefore,
also the BDD has to be large. For LTOBDDs this approach is more difficult
to apply. If in one part of the LTOBDD $x_1 \oplus x_2$ is tested and in the other
part $x_1 \oplus x_3$, one can hardly say that nothing about $x_1$ is known in one
of the parts of the LTOBDD. The main result of this paper is to show how to
overcome this problem.

The paper is organized as follows. In Section 2 we repeat the definitions of 
several variants of BDDs and define the corresponding variants of LTOBDDs. 
In Section 3 we present lower bound methods for LTFBDDs and in Section 4 for 
⊕LTOBDDs. Finally, we summarize our results and compare the classes of
functions with polynomial size LTOBDDs with the corresponding classes for other
variants of BDDs. Due to lack of space all upper bound proofs and some details 
of the lower bound proofs are omitted and can be found in the full version [17]. 

2 Further Definitions 

Before we generalize the definition of LTOBDDs we discuss an alternate
definition of LTOBDDs, which is equivalent to that given in the Introduction and
will be useful in our lower bound proofs. In order to simplify the notation we
always assume that vectors are column vectors and we also use vectors as
arguments of functions. Furthermore we only consider vector spaces over
$\mathbb{F}_2$. Then an LTOBDD for some function $f$ consists of an OBDD for some
function $g$ and a nonsingular matrix $A$, so that $f(x) = g(A \cdot x)$. In order
to illustrate this definition Figure 1 shows an LTOBDD for some function $f$ and
an isomorphic OBDD for some function $g$, so that $f(x) = g(A \cdot x)$ for



We now define some generalizations of OBDDs and the linear transformed 
variants of these generalizations. In FBDDs (Free BDDs, also called read-once 
branching programs) on each computation path each variable is tested at most 
once. This property is also called the read-once property. An LTFBDD for some
function $f$ consists of an FBDD for some function $g$ and a nonsingular matrix $A$
so that $f(x) = g(A \cdot x)$. If we draw LTFBDDs as FBDDs with linear tests at
the internal nodes, we see that at most n different linear tests may occur in 
an LTFBDD. Another possibility to define LTFBDDs is to allow an arbitrary 
number of different linear tests. We call the resulting variant of LTFBDDs strong 
LTFBDDs: In a strong LTFBDD the linear tests of each computation path have 
to be linearly independent. This definition is quite natural, since the term “free” 
in the name FBDD means that a path leading from the source to some node v 
can be extended to a computation path (a path corresponding to some input) via 
the 0-edge leaving v and the 1-edge as well. In a BDD this is obviously equivalent 
to the read-once property. In linear transformed BDDs this is possible iff on each 
path the linear tests performed on this path are linearly independent. 

Obviously, an LTFBDD is also a strong LTFBDD, while the converse is
not true. We shall even see in the following section that polynomial size strong
LTFBDDs are more powerful than polynomial size LTFBDDs so that the name
strong LTFBDD is justified.

A nondeterministic OBDD is an OBDD where each internal node may have 
an arbitrary number of outgoing 0-edges and 1-edges. Hence, for each input 
there may be more than one computation path. A function represented by a 
nondeterministic OBDD takes the value 1 on the input a, if there is at least 
one computation path for a that leads to the 1-sink. ⊕OBDDs are syntactically
defined as nondeterministic OBDDs. However, a ⊕OBDD computes the value 1
on the input a, if the number of computation paths for a from the source to
the 1-sink is odd. We may define nondeterministic LTOBDDs and ⊕LTOBDDs
by introducing a nonsingular transformation matrix A as described above or by 
allowing linear tests at the internal nodes, where a generalized variable ordering 
has to be respected. 
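
The difference between the two acceptance modes can be sketched in Python. Both variants count the computation paths from the source to the 1-sink; the nondeterministic OBDD accepts if the count is positive, the ⊕OBDD if it is odd. The graph encoding (a dictionary from node names to a tested variable and two successor lists) and the example diagram are assumptions made for this sketch only.

```python
from functools import lru_cache


def count_paths_to_one(nodes, source, a):
    """Number of computation paths for input a from source to the 1-sink.

    nodes maps a node name to (variable index, 0-successors, 1-successors);
    the sinks are the strings "0" and "1".
    """
    @lru_cache(maxsize=None)
    def count(v):
        if v == "1":
            return 1
        if v == "0":
            return 0
        var, succ0, succ1 = nodes[v]
        successors = succ1 if a[var] else succ0
        return sum(count(w) for w in successors)

    return count(source)


def nondeterministic_value(nodes, source, a):
    """1 iff at least one computation path reaches the 1-sink."""
    return int(count_paths_to_one(nodes, source, a) > 0)


def parity_value(nodes, source, a):
    """1 iff the number of computation paths to the 1-sink is odd."""
    return count_paths_to_one(nodes, source, a) % 2


# Hypothetical diagram respecting the ordering x0, x1.
example = {
    "s": (0, ["0"], ["u", "w"]),  # two outgoing 1-edges
    "u": (1, ["0"], ["1"]),
    "w": (1, ["1"], ["1"]),
}
```

On the input (1, 1) the example has two accepting paths, so the same graph yields 1 as a nondeterministic OBDD and 0 as a ⊕OBDD.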

The investigation of ⊕OBDDs is motivated by polynomial time algorithms
for several important operations on Boolean functions, which are presented by
Gergov and Meinel [5] and Waack [18]. It is straightforward to extend most of
these algorithms to ⊕LTOBDDs. We shall see that polynomial size ⊕LTOBDDs
can represent a larger class of functions than ⊕OBDDs. On the other hand, we
also obtain exponential lower bounds for ⊕LTOBDDs.

3 Lower Bounds for LTFBDDs and a Comparison of
LTFBDDs and Strong LTFBDDs

Lower bounds for FBDDs can be proved by cut-and-paste arguments as shown
by Wegener [19] and Žák [20]. The following lemma, which we present without
proof, describes an extension of the cut-and-paste method that is suitable for
LTFBDDs.

Lemma 1. Let $f : \{0,1\}^n \to \{0,1\}$ and let $k > 1$. If for all $k \times n$ matrices $D$
with linearly independent rows, for all $c = (c_0, \ldots, c_{k-1}) \in \{0,1\}^k$ and for all
$z \in \{0,1\}^n$, $z \neq (0, \ldots, 0)$, there is an $x \in \{0,1\}^n$ so that $D \cdot x = c$ and
$f(x) \neq f(x \oplus z)$, the LTFBDD size for $f$ is bounded below by [...] $- 1$.

We show how to apply this method to a particular function. We call the
function defined in the following the matrix storage access function MSA. We
remark that a similar function was considered by Jukna, Razborov, Savický and
Wegener [12]. Let $n$ be a power of 2 and let $b = \lfloor ((n-1)/\log n)^{1/2} \rfloor$. Let the
input $x_0, \ldots, x_{n-1}$ be partitioned into $x_0$, into $t = \log n$ matrices $C^0, \ldots, C^{t-1}$
of size $b \times b$ and possibly some remaining variables. Let $s_i(x) = 1$ if the matrix $C^i$
contains a row consisting of ones only, and let $s_i(x) = 0$ otherwise. Let $s(x)$ be
the value of $(s_{t-1}(x), \ldots, s_0(x))$ interpreted as a binary number. Then

$$\mathrm{MSA}(x) = \begin{cases} x_0 & \text{if } s(x) = 0, \\ x_0 \oplus x_{s(x)} & \text{if } s(x) > 0. \end{cases}$$

The following upper and lower bound results for MSA show that polynomial size
strong LTFBDDs are more powerful than polynomial size LTFBDDs.
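
The definition of MSA can be sketched in Python as follows. The sketch assumes that $b$ is the floor of the square root of $(n-1)/\log n$, that the matrices $C^0, \ldots, C^{t-1}$ occupy the variables $x_1, x_2, \ldots$ consecutively in row-major order, and that MSA returns $x_0$ for $s(x) = 0$ and $x_0 \oplus x_{s(x)}$ otherwise (the case distinction used in the proof of Theorem 3). The layout is an assumption of this sketch; the text only fixes the partition into $x_0$, the matrices and the remaining variables.

```python
import math


def msa(x):
    """Matrix storage access function on a 0/1 list x of length n = 2^t."""
    n = len(x)
    t = n.bit_length() - 1            # t = log2(n)
    assert n >= 2 and (1 << t) == n   # n must be a power of 2
    b = math.isqrt((n - 1) // t)      # side length of each matrix

    s = 0
    for i in range(t):
        base = 1 + i * b * b          # assumed start of matrix C^i
        rows = [x[base + r * b: base + (r + 1) * b] for r in range(b)]
        if any(all(row) for row in rows):
            s |= 1 << i               # s_i(x) = 1: some row of C^i is all ones

    return x[0] if s == 0 else x[0] ^ x[s]
```

For n = 16 we get b = 1, so each matrix is a single variable and s(x) is simply the number encoded by $x_4 x_3 x_2 x_1$.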







Theorem 2. There is a strong LTFBDD for MSA with 0(n^/ log n) nodes. 
Theorem 3. LTFBDDs for MSA have size 

Sketch of Proof. It suffices to show that for $k = b - 2$ the assumptions of Lemma 1
are fulfilled. Let a $k \times n$ matrix $D$, a vector $c = (c_0, \ldots, c_{k-1})$ and a vector
$z \in \{0,1\}^n$, $z \neq (0, \ldots, 0)$, be given. We are going to construct an input $x$ for
which $D \cdot x = c$ and $\mathrm{MSA}(x) \neq \mathrm{MSA}(x \oplus z)$.

If $z_0 = 1$, we choose $s^* = 0$. Otherwise we choose some $s^*$ for which $z_{s^*} = 1$.
We shall construct an input $x$ so that $s(x) = s^*$ and $s(x \oplus z) = s^*$ as well.
If $s^* = 0$, it holds that $\mathrm{MSA}(x) = x_0 \neq x_0 \oplus z_0 = \mathrm{MSA}(x \oplus z)$ and, if $s^* \neq 0$,
we have $\mathrm{MSA}(x) = x_0 \oplus x_{s(x)} \neq (x_0 \oplus z_0) \oplus (x_{s(x \oplus z)} \oplus z_{s(x \oplus z)}) = \mathrm{MSA}(x \oplus z)$
as required.

Let $(s^*_{t-1}, \ldots, s^*_0)$ be the representation of the chosen value for $s^*$ as a binary
number. For $i = 0, \ldots, t-1$ we successively construct linear equations of the
form $x_j = 0$ or $x_j = 1$, which make sure that for $x$ and $x \oplus z$ the matrix $C^i$
contains a row consisting of ones only, if $s^*_i = 1$, or that $C^i$ contains a column
consisting of zeros only, if $s^*_i = 0$. Hence, for a solution of the system of equations
the number $s(x)$ takes the value $s^*$ and $\mathrm{MSA}(x) \neq \mathrm{MSA}(x \oplus z)$. However, we also
have to make sure that the equations $D \cdot x = c$ are fulfilled. Hence, we shall choose
the equations $x_j = 0$ or $x_j = 1$ in such a way that the vectors of coefficients of
all equations together with the rows of $D$ are a linearly independent set. Then
there is a solution $x$ for the system of all considered linear equations so that
$D \cdot x = c$ and $\mathrm{MSA}(x) \neq \mathrm{MSA}(x \oplus z)$.
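
The last step relies on a standard fact: a system of linear equations over $\mathbb{F}_2$ whose coefficient vectors are linearly independent always has a solution. A minimal Python sketch of Gaussian elimination over $\mathbb{F}_2$ illustrates this. The bitmask encoding (bit j of a mask is the coefficient of $x_j$) and the function names are assumptions of the sketch, not notation from the paper.

```python
def dot(mask, x_bits):
    """Inner product over F_2 of a coefficient bitmask and a 0/1 list."""
    return sum(x_bits[j] for j in range(len(x_bits)) if (mask >> j) & 1) % 2


def solve_gf2(rows, rhs, n):
    """Return some x in {0,1}^n with dot(rows[i], x) == rhs[i], or None."""
    pivots = []  # entries (pivot position, reduced mask, reduced right side)
    for mask, bit in zip(rows, rhs):
        for p, pm, pb in pivots:
            if (mask >> p) & 1:       # eliminate earlier pivot variables
                mask ^= pm
                bit ^= pb
        if mask == 0:
            if bit:
                return None           # equation 0 = 1: inconsistent system
            continue                  # redundant equation
        pivots.append((mask.bit_length() - 1, mask, bit))

    x = 0
    # Each reduced equation only involves variables up to its pivot, so the
    # pivots are resolved in increasing order; free variables are set to 0.
    for p, pm, pb in sorted(pivots):
        rest = pm & ~(1 << p)
        if pb ^ (bin(rest & x).count("1") & 1):
            x |= 1 << p
    return [(x >> j) & 1 for j in range(n)]
```

For example, the rows 0b0111, 0b0001 and 0b1000 with right-hand sides 1, 1 and 0 model one row of $D$ together with the unit equations $x_0 = 1$ and $x_3 = 0$; the three coefficient vectors are independent, so a solution exists.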

For each $i = 0, \ldots, t-1$ we inductively construct $2b$ equations where the
left-hand sides are single variables from the matrix $C^i$. These equations make
sure that $s_i(x)$ takes the value $s^*_i$. W.l.o.g. let $s^*_i = 1$. By rank arguments it can
be shown that we can choose two rows of $C^i$ so that the set of $2b$ equations,
which have as left-hand side the single variables of those two rows, together with
all previously constructed equations are linearly independent. For one of those
rows we choose equations that make sure that for the input $x$ this row consists
only of entries 1. Similarly, for the other row equations are chosen that
make sure that for the input $x \oplus z$ this row consists only of entries 1.
Here we recall that $z$ is fixed. If $s^*_i = 0$, by the same arguments two columns
can be constructed so that the first one only consists of entries 0 for $x$ and the
second one only consists of entries 0 for $x \oplus z$.

Altogether, we obtain a system of linear equations where the set of vectors of
coefficients is linearly independent. Let $x$ be a solution. Then the linear equations
enforce that $D \cdot x = c$, that $s(x) = s(x \oplus z)$, and, by the case distinction above,
that $\mathrm{MSA}(x) \neq \mathrm{MSA}(x \oplus z)$. □

4 Lower Bounds for LTOBDDs, ⊕LTOBDDs and
Nondeterministic LTOBDDs 

LTOBDDs, ⊕LTOBDDs and nondeterministic LTOBDDs have in common that
they respect a generalized variable ordering. Hence, we shall apply communication
complexity based arguments in order to prove lower bounds. For an
introduction to communication complexity theory we refer to the monographs
of Hromkovič [8] and Kushilevitz and Nisan [13]. We are going to prove lower
bounds on the communication complexity by constructing large fooling sets. In
order to introduce the notation we repeat the definition of fooling sets.

Definition 4. Let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function. Let $(L, R)$ be a
partition of the set of input variables. For an input $x$ let $x^{(l)}$ denote the
assignment to the variables in $L$ according to $x$ and let $x^{(r)}$ denote the assignment to
the variables in $R$ according to $x$. Let $(x^{(l)}, y^{(r)})$ be the input consisting of $x^{(l)}$
and $y^{(r)}$. A fooling set for $f$ and the partition $(L, R)$ of the input variables is a
set $M \subseteq \{0,1\}^n$ of inputs which has for some $c \in \{0,1\}$ the following properties.

1. $\forall x \in M : f(x) = c$.

2. $\forall x, y \in M, x \neq y : [f(x^{(l)}, y^{(r)}) = \bar{c}] \vee [f(y^{(l)}, x^{(r)}) = \bar{c}]$.

We say that $M$ is a strong fooling set if it has the following property 2' instead
of property 2 from above.

2'. $\forall x, y \in M, x \neq y : [f(x^{(l)}, y^{(r)}) = \bar{c}] \wedge [f(y^{(l)}, x^{(r)}) = \bar{c}]$.

We call $M$ a 1-fooling set or strong 1-fooling set, respectively, if it has the above
properties for $c = 1$.
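
The fooling set conditions can be checked mechanically for small examples. The following Python sketch assumes the standard reading of the definition, in which the mixed inputs must take the value different from c (for at least one of the two mixtures, or for both in the strong case). The equality function used as an example is a classical illustration and is not taken from the paper; all names are hypothetical.

```python
from itertools import combinations, product


def mix(x, y, left):
    """Input that agrees with x on the positions in left and with y elsewhere."""
    return tuple(x[i] if i in left else y[i] for i in range(len(x)))


def is_fooling_set(f, M, left, c, strong=False):
    """Check properties 1 and 2 (or 2' if strong) for the partition (left, rest)."""
    M = list(M)
    if any(f(x) != c for x in M):
        return False                        # property 1 fails
    for x, y in combinations(M, 2):
        first = f(mix(x, y, left)) != c     # f(x^(l), y^(r)) differs from c
        second = f(mix(y, x, left)) != c    # f(y^(l), x^(r)) differs from c
        ok = (first and second) if strong else (first or second)
        if not ok:
            return False
    return True


def eq(x):
    """Hypothetical example: equality test of the two halves of x."""
    half = len(x) // 2
    return int(x[:half] == x[half:])


# All inputs of the form (w, w) for w in {0,1}^2, with L = first half.
diagonal = [w + w for w in product((0, 1), repeat=2)]
```

Here `diagonal` is a strong 1-fooling set of size 4 for `eq` and the partition into the two halves, since mixing the halves of two different diagonal inputs always yields the value 0.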

It is well-known that the size of a fooling set for a function $f$ and a partition
$(L, R)$ is a lower bound on the size of OBDDs for $f$ and all variable orderings
where the variables in $L$ are arranged before the variables in $R$. However, in an
LTOBDD for $f$ and the transformation matrix $A$ the function $g(y) = f(A^{-1} \cdot y)$
is represented. Hence, we have to construct large fooling sets for $g$ in order
to obtain lower bounds on the LTOBDD size for $f$. In order to simplify the
notation let $B = A^{-1}$ throughout this section. Furthermore, let the number $n$ of
variables be an even number. We always partition the set $\{y_0, \ldots, y_{n-1}\}$, which $g$
depends on, into $L = \{y_0, \ldots, y_{n/2-1}\}$ and $R = \{y_{n/2}, \ldots, y_{n-1}\}$. Furthermore,
we use the notation $y^{(l)}$ and $y^{(r)}$ to denote $(y_0, \ldots, y_{n/2-1})$ and $(y_{n/2}, \ldots, y_{n-1})$,
respectively. We shall apply the following lemmas to prove the lower bounds.
Lemma 6 is inspired by the presentation of Dietzfelbinger and Savický [4]. The
proofs of the lemmas are omitted.

Lemma 5. If for all nonsingular matrices $B$ there is a fooling set of size at
least $b$ for the function $g(y) = f(B \cdot y)$ and the partition $(L, R)$, the LTOBDD
size for $f$ and all generalized variable orderings is at least $b$.

Lemma 6. If for all nonsingular matrices $B$ there is a fooling set of size at
least $b$ for the function $g(y) = f(B \cdot y)$ and the partition $(L, R)$, the ⊕LTOBDD
size for $f$ and all generalized variable orderings is at least $b^{1/2} - 1$. If for all
nonsingular matrices $B$ there is even a strong fooling set of size at least $b'$ for
the function $g(y) = f(B \cdot y)$ and the partition $(L, R)$, the ⊕LTOBDD size for $f$
and all generalized variable orderings is at least $b'$.






Lemma 7. If for all nonsingular matrices $B$ there is a 1-fooling set of size at
least $b$ for the function $g(y) = f(B \cdot y)$ and the partition $(L, R)$, the
nondeterministic LTOBDD size for $f$ and all generalized variable orderings is at least $b$.

It remains to apply the lemmas to obtain an exponential lower
bound for an explicitly defined function. In the following we define the function
INDEX-EQ, a combination of the functions INDEX and EQ, which are both
well-known functions in communication complexity theory. We get lower bounds
on the size of LTOBDDs, ⊕LTOBDDs and nondeterministic LTOBDDs by
constructing large fooling sets which are even simultaneously strong fooling sets
and 1-fooling sets.

Definition 8. Let $k$ be a power of 2 and let $N = 2^k$. The function INDEX-EQ
is defined on $n = 3N/2$ variables $x_0, \ldots, x_{n-1}$. The variables $x_0, \ldots, x_{N-1}$ are
interpreted as a memory and the $N/2$ variables $x_N, \ldots, x_{n-1}$ are interpreted as
$N/(2 \log N)$ pointers each consisting of $\log N$ bits. Let $m = N/(4 \log N)$. Let
$a(1), \ldots, a(m), b(1), \ldots, b(m)$ denote the values of the pointers. Then
INDEX-EQ$(x_0, \ldots, x_{n-1})$ takes the value 1 iff the following conditions hold.

1. $\forall i \in \{1, \ldots, m\} : x_{a(i)} = x_{b(i)}$.

2. $a(1) < \cdots < a(m)$ and $b(1) < \cdots < b(m)$.

3. $a(m) < b(1)$ or $b(m) < a(1)$.

Because of the first condition the computation of the function includes the 
test whether the words whose bits are addressed by the pointers are equal. The 
second and the third condition ensure that the equality test has only to be 
performed if the pointers are ordered and if either all a-pointers are smaller 
than all b-pointers or vice versa. We remark that the last two conditions are
not necessary for the proof of the lower bound. These conditions allow us to prove
a polynomial upper bound on the FBDD size of INDEX-EQ, which we state 
without proof. Afterwards we prove the lower bound. 
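
The three conditions of Definition 8 can be sketched in Python. The definition does not fix how the pointers are laid out in $x_N, \ldots, x_{n-1}$, so the sketch assumes that the pointers are stored consecutively, first $a(1), \ldots, a(m)$ and then $b(1), \ldots, b(m)$, each with its most significant bit first. These layout choices and the function name are assumptions.

```python
def index_eq(x, k):
    """INDEX-EQ on a 0/1 list x of length 3N/2 with N = 2^k (so log N = k)."""
    N = 1 << k
    assert len(x) == 3 * N // 2
    assert N % (4 * k) == 0            # m = N / (4 log N) must be an integer
    m = N // (4 * k)

    memory, pointer_bits = x[:N], x[N:]
    # Assumed layout: 2m consecutive k-bit pointers, most significant bit first.
    pointers = [
        int("".join(str(bit) for bit in pointer_bits[p * k:(p + 1) * k]), 2)
        for p in range(2 * m)
    ]
    a, b = pointers[:m], pointers[m:]

    cond1 = all(memory[ai] == memory[bi] for ai, bi in zip(a, b))
    cond2 = (all(u < v for u, v in zip(a, a[1:]))
             and all(u < v for u, v in zip(b, b[1:])))
    cond3 = a[-1] < b[0] or b[-1] < a[0]
    return int(cond1 and cond2 and cond3)
```

For k = 4 we have N = 16, n = 24 and m = 1, so the function compares the memory cells addressed by one a-pointer and one b-pointer and additionally requires the two addresses to differ.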

Theorem 9. There are FBDDs of size 0{n^) for INDEX-EQ. 

Theorem 10. The size of LTOBDDs, ⊕LTOBDDs and nondeterministic
LTOBDDs for the function INDEX-EQ is bounded below by

Sketch of Proof. By Lemmas 5-7 it suffices to show that for all nonsingular
$n \times n$ matrices $B$ there is a strong 1-fooling set of size at least $2^m$ for the function
$g(y) = \text{INDEX-EQ}(B \cdot y)$ and the partition $(\{y_0, \ldots, y_{n/2-1}\}, \{y_{n/2}, \ldots, y_{n-1}\})$.
The construction of the strong 1-fooling set essentially consists of the following
steps. In the first one we construct sets $I$ and $J$ of indices of memory variables.
In the second step we construct a set which will be the fooling set. The set $I$ has
the property that for all inputs of the fooling set the values of the variables with
indices in $I$ only depend on the results of the linear tests in $\{y_0, \ldots, y_{n/2-1}\}$.
Similarly, for the inputs of the fooling set the values of the memory variables
with indices in $J$ only depend on the linear tests in $\{y_{n/2}, \ldots, y_{n-1}\}$. Finally, it
has to be shown that the constructed set is really a fooling set.






Let $B$ be a nonsingular $n \times n$ matrix. We always keep in mind that $B$ is the
matrix for which $(x_0, \ldots, x_{n-1}) = B \cdot (y_0, \ldots, y_{n-1})$. In particular, each row
of $B$ corresponds to one of the x-variables and each column of $B$ to one of
the y-variables.

We use the following notation. Let $B^{(l)}$ be the left half of $B$, i.e., the $n \times n/2$
matrix consisting of the first $n/2$ elements of each row of $B$. Similarly, let $B^{(r)}$
be the right half of $B$. Let $B[x_i]$ denote the $i$th row of $B$ (the row corresponding
to $x_i$), and let $B^{(l)}[x_i]$ and $B^{(r)}[x_i]$ be the left and right half of this row,
respectively. Let each pointer $a(i)$ consist of the $k = \log N$ bits $a_{k-1}(i), \ldots, a_0(i)$
which are interpreted as a binary number. Similarly let each pointer $b(i)$ consist
of the bits $b_{k-1}(i), \ldots, b_0(i)$. We shall use the notations $x_j$ and $a_l(i)$
simultaneously even if both denote the same bit. Then $B[a_l(i)]$ denotes the row of $B$
corresponding to the bit $a_l(i)$ of the input.

The choice of $I$ and $J$ can be done in such a way that $I$ and $J$ have the
following properties. We omit the details of how to choose $I$ and $J$.

(P1) $|I| = N/16$, $|J| = N/16$ and $I, J \subseteq \{0, \ldots, N-1\}$.

(P2) The set $\{B^{(l)}[x_i] \mid i \in I\}$ is linearly independent and
$\mathrm{span}\{B^{(l)}[x_i] \mid i \in I\} \cap \mathrm{span}\{B^{(l)}[x_j] \mid j \in J \vee N \le j \le n-1\} = \{(0, \ldots, 0)\}$.

(P3) The set $\{B^{(r)}[x_j] \mid j \in J\}$ is linearly independent and
$\mathrm{span}\{B^{(r)}[x_j] \mid j \in J\} \cap \mathrm{span}\{B^{(r)}[x_i] \mid i \in I \vee N \le i \le n-1\} = \{(0, \ldots, 0)\}$.

We shall apply property (P2) (and similarly (P3)) in order to prove that
a system of linear equations whose vectors of coefficients are $B^{(l)}[x_i]$, where
$i \in I \cup J \cup \{j \mid N \le j \le n-1\}$, has a solution. (P2) and (P3) also imply that
$I \cap J = \emptyset$.

Let $i^*$ be the $(N/32)$th smallest element of $I$ and let $j^*$ be the $(N/32)$th smallest
element of $J$. If $i^* < j^*$, we choose for $I^*$ the $m$ smallest elements of $I$ and for $J^*$
the $m$ largest elements of $J$. W.l.o.g. $k > 32$. Then $m < N/32$ and all elements
of $I^*$ are smaller than all elements of $J^*$. If $i^* > j^*$, we choose for $I^*$ the $m$
largest elements of $I$ and for $J^*$ the $m$ smallest elements of $J$. Then all elements
of $I^*$ are larger than all elements of $J^*$. Let $I^* = \{i(1), \ldots, i(m)\}$ and $J^* =
\{j(1), \ldots, j(m)\}$ such that $i(1) < \cdots < i(m)$ and $j(1) < \cdots < j(m)$. We shall
only construct inputs where the chosen addresses $a(1), \ldots, a(m), b(1), \ldots, b(m)$
are equal to $i(1), \ldots, i(m), j(1), \ldots, j(m)$. Hence, for all considered inputs the
second and the third condition of the definition of INDEX-EQ are fulfilled so
that the value of INDEX-EQ on these inputs only depends on the first condition.

Let $i_\nu(\alpha)$ and $j_\nu(\alpha)$ denote the $\nu$th bit of $i(\alpha)$ and $j(\alpha)$, respectively. Let
$s = (s_0, \ldots, s_{n-1})$ be an arbitrary solution of the system of linear equations
that consists for all $\alpha \in \{1, \ldots, m\}$ and all $\nu \in \{0, \ldots, k-1\}$ of the following
equations.

$$B[a_\nu(\alpha)] \cdot s = i_\nu(\alpha), \qquad B[b_\nu(\alpha)] \cdot s = j_\nu(\alpha) \qquad (1)$$

This system of linear equations has a solution since all rows of $B$ are linearly
independent.







Now we construct the fooling set. For all $(c(1), \ldots, c(m)) \in \{0,1\}^m$ we
construct a system of linear equations. We shall prove that this system has at least
one solution. We select an arbitrary solution and include it into the fooling
set $M$. It can be shown that for different assignments to $c(1), \ldots, c(m)$ we get
different systems of linear equations which have disjoint sets of solutions so that
we obtain a set $M$ of size $2^m$.

In the following system of linear equations the only variables are denoted
by $y$, all other identifiers denote constants. The linear equations are arranged
as a table which shows the connections between the different equations. Note
that the left column only contains variables in $y^{(l)}$ and the right column only
variables in $y^{(r)}$. Hence, we may also consider the equations as two independent
systems, one determining the values of $y^{(l)}$ and the other one the values of $y^{(r)}$.

1st block: For all $\alpha \in \{1, \ldots, m\}$:

= c(a) and 

2nd block: For all $\alpha \in \{1, \ldots, m\}$:

and B^^'>[xj(a)]y^'’^ = c{a) © B^‘'>[xj^c,)]s^‘^ 

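3rd block: For all $\alpha \in \{1, \ldots, m\}$ and for all $\nu \in \{0, \ldots, k-1\}$:

$B^{(l)}[a_\nu(\alpha)] y^{(l)} = B^{(l)}[a_\nu(\alpha)] s^{(l)}$ and $B^{(r)}[a_\nu(\alpha)] y^{(r)} = B^{(r)}[a_\nu(\alpha)] s^{(r)}$

$B^{(l)}[b_\nu(\alpha)] y^{(l)} = B^{(l)}[b_\nu(\alpha)] s^{(l)}$ and $B^{(r)}[b_\nu(\alpha)] y^{(r)} = B^{(r)}[b_\nu(\alpha)] s^{(r)}$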

We only summarize the remaining steps of the proof. Using the Properties 
(P2) and (P3) it can be shown that the system of linear equations has a solution. 
The claim that M is a strong 1-fooling set can be proved by combining the system 
of equations of the definition of the fooling set, the definition of B and the system 
of equations (1). □ 



5 A Comparison of Complexity Classes of Linear 
Transformed BDDs 

Let P-LTOBDD, P-LTFBDD, P-sLTFBDD, NP-LTOBDD and ⊕P-LTOBDD
denote the classes of functions with polynomial size LTOBDDs, polynomial size
LTFBDDs, polynomial size strong LTFBDDs, polynomial size nondeterministic
LTOBDDs and polynomial size ⊕LTOBDDs, respectively. Let P-OBDD,
P-FBDD, ... be defined similarly. In Figure 2 some inclusions between these classes
are summarized. $A \to B$ means that $A$ is a proper subset of $B$, and a dotted line
between classes $A$ and $B$ means that these classes are not comparable, i.e., $A \not\subseteq B$
and $B \not\subseteq A$. The numbers in the figure refer to the following list of functions
proving that the corresponding inclusion is proper or proving that the classes
are not comparable. In order to make Figure 2 clearer, the relations between
P-LTOBDD and NP-LTOBDD and some related classes are drawn separately.

First we remark that it is easy to see that all inclusions shown in Figure 2
hold. Besides the functions mentioned in the following, in the literature a lot
of functions can be found that witness that the inclusions (1), (3) and (15)
are proper. Our results on the function MSA prove that the inclusions (1), (7),
(13), (15) and (18) are proper. The polynomial upper bounds for nondeterministic
OBDDs and ⊕OBDDs are straightforward, and the upper bound for strong
LTFBDDs is stated in Theorem 2. The lower bounds are stated in Theorem 3.
The characteristic functions of linear codes prove that the inclusions (2), (6),
(11) and (17) are proper. The upper bound and references for the lower bounds
are given in the Introduction. In particular the exponential lower bound for
nondeterministic OBDDs follows from the fact that nondeterministic OBDDs are
simultaneously (syntactic) nondeterministic read-k-times branching programs.
We remark that Günther and Drechsler [6] presented a different function to
prove that (2) is a proper inclusion. Their results implicitly show that also
(6) and (17) are proper inclusions. Our results for INDEX-EQ prove that (3),
(10), (14) and (19) are proper inclusions (Theorems 9 and 10). Here we use the
fact that INDEX-EQ has polynomial size ⊕BDDs and polynomial size
nondeterministic BDDs because it has polynomial size FBDDs. It remains to discuss
the incomparability results. (4) and (16) follow from the bounds on MSA and
on the characteristic functions of linear codes, (5) follows from the bounds on
INDEX-EQ and the characteristic functions of linear codes, and (8), (9) and
(12) follow from our results on INDEX-EQ and MSA.

Fig. 2. Comparison of the complexity classes of polynomial size LTOBDDs and
related BDD variants.

We conclude that the methods presented in this paper allow us to prove
exponential lower bounds for several variants of linear transformed BDDs. In
particular, it is possible to separate the classes of functions with polynomial size
representations for many variants of linear transformed BDDs. It remains an
open problem to prove exponential lower bounds for strong LTFBDDs and to
prove exponential lower bounds on the size of LTOBDDs for practically important
functions like multiplication.






Acknowledgment 

I thank Beate Bollig, Rolf Drechsler, Wolfgang Günther, Martin Sauerhoff,
Stephan Waack and Ingo Wegener for fruitful discussions and helpful comments.

References 

1. Aborhey, S. (1988). Binary decision tree test functions. IEEE Transactions on
Computers 37, 1461-1465.

2. Bern, J., Meinel, C. and Slobodová, A. (1995). Efficient OBDD-based Boolean
manipulation in CAD beyond current limits. In Proc. of 32nd Design Automation
Conference, 408-413.

3. Bryant, R.E. (1986). Graph-based algorithms for Boolean function manipulation.
IEEE Transactions on Computers 35, 677-691.

4. Dietzfelbinger, M. and Savický, P. (1997). Parity OBDDs cannot represent the
multiplication succinctly. Preprint Universität Dortmund.

5. Gergov, J. and Meinel, C. (1996). Mod-2-OBDDs: a data structure that generalizes
EXOR-sum-of-products and ordered binary decision diagrams. Formal Methods
in System Design 8, 273-282.

6. Günther, W. and Drechsler, R. (1998). BDD minimization by linear transformations.
In Proc. of Advanced Computer Systems, Szczecin, Poland, 525-532.

7. Günther, W. and Drechsler, R. (1998). Linear transformations and exact
minimization of BDDs. In Proc. of IEEE Great Lakes Symposium on VLSI, 325-330.

8. Hromkovič, J. (1997). Communication Complexity and Parallel Computing.
Springer.

9. Jukna, S. (1995). A note on read-k times branching programs. RAIRO Theoretical
Informatics and Applications 29, 75-83.

10. Jukna, S. (1999). Linear codes are hard for oblivious read-once parity branching
programs. Information Processing Letters 69, 267-269.

11. Jukna, S. and Razborov, A. (1998). Neither reading few bits twice nor reading
illegally helps much. Discrete Applied Mathematics 85, 223-238.

12. Jukna, S., Razborov, A., Savický, P. and Wegener, I. (1997). On P versus NP ∩ co-NP
for decision trees and read-once branching programs. In Proc. of Mathematical
Foundations of Computer Science, LNCS 1295, 319-326.

13. Kushilevitz, E. and Nisan, N. (1997). Communication Complexity. Cambridge
University Press.

14. Meinel, C., Somenzi, F. and Theobald, T. (1997). Linear sifting of decision
diagrams. In Proc. of 34th Design Automation Conference, 202-207.

15. Okol’nishnikova, E.A. (1991). On lower bounds for branching programs. Metody
Diskretnogo Analiza 51, 61-83 (in Russian). English Translation in Siberian
Advances in Mathematics 3, 152-166, 1993.

16. Razborov, A. A. (1991). Lower bounds for deterministic and nondeterministic
branching programs. In Proc. of Fundamentals of Computing Theory, LNCS 529,
47-60.

17. Sieling, D. (1999). Lower bounds for linear transformed OBDDs and FBDDs.
Preprint Universität Dortmund.

18. Waack, S. (1997). On the descriptive and algorithmic power of parity ordered
binary decision diagrams. In Proc. of Symposium on Theoretical Aspects of Computer
Science, LNCS 1200, 201-212.

19. Wegener, I. (1988). On the complexity of branching programs and decision trees for
clique functions. Journal of the Association for Computing Machinery 35, 461-471.

20. Žák, S. (1984). An exponential lower bound for one-time-only branching programs.
In Proc. of Mathematical Foundations of Computer Science, LNCS 176, 562-566.



A Unifying Framework for Model Checking 
Labeled Kripke Structures, Modal Transition 
Systems, and Interval Transition Systems 



Michael Huth 

Department of Computing and Information Sciences,
Kansas State University, Manhattan, KS 66506-2302,
huth@cis.ksu.edu
www.cis.ksu.edu/~huth



Abstract. We build on the established work on modal transition sys- 
tems and probabilistic specifications to sketch a framework in which sys- 
tem description, abstraction, and finite-state model checking all have a 
uniform presentation across various levels of qualitative and quantita- 
tive views together with mediating abstraction and concretization maps. 
We prove safety results for abstractions within and across such views for 
the entire modal mu-calculus and show that such abstractions allow for 
some compositional reasoning with respect to a uniform family of process 
algebras à la CCS.



1 Introduction and Motivation 

Process algebras such as Milner’s CCS [16] and modular guarded command
languages such as McMillan’s SMV [15] are important description languages for
a wide range of computer systems. The operational meaning of such descriptions
is typically captured by a triple $\mathcal{M} = (S, R, L)$, where $S$ is a set of states, $R$ the
state-transition relation, and $L$ contains atomic state information; the latter is
usually trivial in an event-based setting. The analysis of such descriptions can be
done in a variety of ways. In model checking [3,20], the idea is to have a
finite-state system $\mathcal{M}$ and a specification $\phi$ in some temporal logic, together with an
efficient algorithm for deciding whether $\phi$ holds for all initial states of $\mathcal{M}$. Due
to the typical exponential blow-up of the state set in the number of “parallel”
components, one often requires abstraction techniques for data and even control
flow paths in order to bring $S$ down to smaller size [4]. One then needs to make
certain that a positive model check of $\phi$ for the abstracted system $\mathcal{M}'$ means
that the original system $\mathcal{M}$ satisfies the specification $\phi$ as well.

While this triad of system description, abstraction, and verification formal- 
ism has been well established and successful for qualitative system design and 
analysis, its transfer to quantitative system descriptions has, by and large, been 
problematic. On the conceptual side, in moving from a qualitative to a quantita- 
tive view of a system, one ordinarily has to change the description language, the 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 369—380, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






notion of abstraction, and the verification engine completely; for a notable excep- 
tion see J. Hillston’s work in [8]. Such changes not only necessitate the knowledge 
of sophisticated and computationally expensive concepts, such as measure the- 
ory [7] and probabilistic bisimulation [13,1], but also make it hard to embed the 
qualitative description into such a quantitative view, or to re-interpret quanti- 
tative results as qualitative judgments. Ideally, one would like to have a uniform 
family of such triads with view-mediating maps across all three dimensions: de- 
scription, abstraction, and verification. 

While our paper is an initial contribution toward crafting such a family, it also 
proposes the development of such a model checking framework for loosely spec- 
ifying and verifying qualitative and quantitative systems. Systems often cannot 
be described in complete detail and usually we would like to give an implementor 
more flexibility in how to realize a specified system. Note that these comments 
apply to qualitative systems, such as concurrency protocols, as well as to
quantitative ones, like loose Markov chains, where the actual state-transition
probabilities may only be known to be within some interval. Our work extends and
builds upon the work on modal transition systems by K. G. Larsen and B. Thomsen
[14,12], and probabilistic specifications [10] by B. Jonsson and K. G. Larsen.
Both approaches have in common that transitions $s \to^{\alpha} s'$ are loosely specified,
meaning that the system description does not determine the actual implementation
fully. For modal transition systems, transitions $s \to^{\alpha} s'$ are either guaranteed
($s \to^{\alpha}_{\Box} s'$) or possible ($s \to^{\alpha}_{\Diamond} s'$) [14]. For probabilistic specifications, we have
transitions of the form $s \to^{\alpha}_{P} s'$, where $P$ is a “set of probabilities” [10] which we will
always assume to be a closed interval $[x, y]$ with $0 \le x$ and $y \le 1$. Conceptually,
the latter models are interesting because they do not commit themselves to being
“reactive”, “generative”, or even “probabilistic” right away. Such interpretations
enter via the chosen notion of refinement, where “total” refinements, our
implementations, exhibit such desired properties. In this paper, we study a modal and
a probabilistic interpretation of such models. In the context of model checking
loosely specified systems, “safety” now means that the information computed
for a property $\phi$ is a consistent and valid approximation of the information that
we could compute for any possible refinement, or implementation.

In the next section, we present three different views of a system along with 
their corresponding refinement notions. In Section 3, each such view determines 
a semantics of the modal mu-calculus which we prove to be sound with respect 
to refinement. Section 4 discusses abstractions and concretizations across system
views and shows how model checking results transfer across such views. In
Section 5, we hint at a process algebra framework which accommodates viewing 
systems at three levels and prove some compositionality results of process alge- 
bra operators with respect to refinement. Section 6 briefly covers a probabilistic 
view, giving rise to loose Markov chains. Finally, Section 7 provides an outlook 
on future work. 






2 Different Views of a System 

To illustrate, we consider the model of an unreliable medium in Figure 1. Clearly, 
the fully specified qualitative system, (a), is of little use as only flawed media
(with an error state) are allowed refinements up to bisimulation. The qualitative, 
but loosely specified system, (b), is already faring better, since an implementa- 
tion may now choose not to realize the transition from state full to state error. 
A possible refinement would therefore be the ideal and always reliable medium 
obtained from (a) by removing state error and all its incoming and outgoing 
transitions. The fully specified quantitative system, (c), prescribes even more re- 
alistic behavior by giving probabilities for correct system behavior. This Markov 
chain may be analyzed further, e.g. to determine its steady-state probability dis- 
tribution. A loosely specified Markov chain, (d), however, allows more freedom 
in that we only specify a range for actual state-transition probabilities. This 
requires a generalization of existing techniques for analyzing Markov chains. 




Fig. 1. Modeling an unreliable medium [10]. Panels: (a) qualitative, fully
specified; (b) qualitative, loosely specified; (c) quantitative, fully specified;
(d) quantitative, loosely specified.



Three views of models. In general, we propose three views of models M = 
(S', R, L) with S a set of states, R: Sx Act x S — *■ H the state-transition function, 
and L: SxkP ^ D the state-labeling function, where Act is a set of action labels, 
AP a set of atomic state predicates, and D is one of the three domains of view, the 
type of the model A4. We write A4l>D to indicate this relationship. If D equals K, 
the two-element lattice { ff < tt } , then such models are known as labeled Kripke 
structures (Figure 1(a); we omitted the action labels since Act is a singleton set); 
such structures are also known as Doubly Labeled Transition Systems (L^TS) in 



372 



Michael Huth 



the literature [6]. If $D$ equals M, the three-element poset $\{dk, ff, tt\}$, which has dk
(dk for “don’t know”) as a least element and in which all other elements are maximal,
then models are essentially the modal transition systems of [14] (Figure 1(b);
$\Box$ is interpreted as $\{tt\}$ and $\Diamond$ as $\{dk, tt\}$). Finally, if $D$ equals the interval
domain I [17,21], the collection of all closed intervals $[x, y]$ with $0 \le x \le y \le 1$,
ordered under reverse containment ($[u, v] \le [x, y]$ iff $u \le x \le y \le v$), then models
are interval transition systems, a special case of the probabilistic specifications
in [10] (Figure 1(d)). Note that the Markov chain in Figure 1(c) can be seen as
a maximal interval transition system, as all its behavior is fully specified with
respect to the information ordering on I, by identifying any $r \in [0, 1]$ with
$[r, r] \in I$. It is helpful and insightful to also interpret $\Box$ and $\Diamond$ on the domains K
and I. On K, both modalities are identified with the set $\{tt\}$; all possible behavior
is also guaranteed. On I, we write $\Box[x, y]$ iff $x > 0$, and $\Diamond[x, y]$ iff $y > 0$: $x$ stands
for the guarantee and $y$ for the possibility of a transition $R(s, a, s') = [x, y]$. Later
on, we interpret negation on $D$, and $\Box$ will not be the dual of $\Diamond$ unless $D$ equals K. 
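
The interval domain and its two modal predicates are easy to experiment with. The following Python sketch is only an illustration under our own naming (`leq_I`, `box`, `diamond` are hypothetical helper names, and intervals are represented as pairs); it encodes reverse containment and the readings "guarantee positive" and "possibility positive" described above.

```python
def leq_I(i, j):
    """Information ordering on I: [u, v] <= [x, y] iff u <= x <= y <= v."""
    (u, v), (x, y) = i, j
    return u <= x <= y <= v

def box(i):
    """Guaranteed: the lower bound (guarantee) is positive."""
    return i[0] > 0

def diamond(i):
    """Possible: the upper bound (possibility) is positive."""
    return i[1] > 0

# [0, 1] carries no information and sits below every other interval;
# point intervals [r, r] are maximal.
bottom = (0.0, 1.0)
```

For instance, `leq_I(bottom, (0.3, 0.7))` holds, whereas an interval such as `(0.0, 0.5)` is possible but not guaranteed.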

Abstractions within system views. It is straightforward to define the sum
$\mathcal{M} + \mathcal{M}'$ of two systems of type $D$. Therefore, we may reduce the concept
of “system $\mathcal{M}$ abstracts system $\mathcal{M}'$” to “state $t$ abstracts state $s$ in system
$\mathcal{M} + \mathcal{M}'$” which, in turn, may be reduced to “state $s$ refines state $t$ in system
$\mathcal{M} + \mathcal{M}'$”. The intuitive meaning of “$s$ refines $t$” is that possible transitions out
of $s$ are matched with possible transitions out of $t$, and guaranteed transitions
out of $t$ are matched with guaranteed transitions out of $s$ [14]; no conditions
are imposed on guaranteed transitions out of $s$, or possible transitions out of $t$.
Further, a notion of refinement should be co-inductive, monotone with respect
to the information ordering on $D$, uniform in the choice of $D$, and should allow
for some compositional reasoning. 

Definition 1. For $\mathcal{M} = (S, R, L) \rhd D$, we define a functional
$F_D\colon \mathcal{P}(S \times S) \to \mathcal{P}(S \times S)$: given $Q \subseteq S \times S$, we set $(s, t) \in F_D(Q)$ iff 

1. For all $a \in Act$ and all $s' \in S$, if $\Diamond R(s, a, s')$, then there is some $t' \in S$
such that $(s', t') \in Q$, $\Diamond R(t, a, t')$, and $R(t, a, t') \le R(s, a, s')$ in $D$. 

2. For all $a \in Act$ and all $t' \in S$, if $\Box R(t, a, t')$, then there is some $s' \in S$ such
that $(s', t') \in Q$, $\Box R(s, a, s')$, and $R(t, a, t') \le R(s, a, s')$ in $D$. 

3. For all $p \in AP$, we have $L(t, p) \le L(s, p)$ in $D$. 

Subsets $Q \subseteq S \times S$ satisfying $Q \subseteq F_D(Q)$ are called $D$-refinements. 
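
For finite systems of type M, the greatest refinement can be approximated from above by starting with all pairs of states and repeatedly discarding pairs that violate the three clauses. The Python sketch below is our own illustration, not an algorithm from the paper: the function names, the dictionary encoding of $R$ and $L$, and the convention that a missing transition has value ff are all assumptions, and clause 2 is read as matching guaranteed transitions out of $t$ with guaranteed transitions out of $s$, as in the informal description above.

```python
from itertools import product

def leq(d, e):
    """Information ordering on M = {dk, ff, tt}: dk is least, ff and tt are maximal."""
    return d == e or d == "dk"

def possible(d):      # diamond, interpreted as {dk, tt}
    return d in ("dk", "tt")

def guaranteed(d):    # box, interpreted as {tt}
    return d == "tt"

def greatest_refinement(states, acts, R, L, props):
    """Shrink S x S until no pair violates clauses 1-3 (finite systems only)."""
    r = lambda s, a, t: R.get((s, a, t), "ff")   # assumption: absent means ff

    def step(Q):
        keep = set()
        for s, t in Q:
            # Clause 1: possible moves of s are matched by possible moves of t.
            ok1 = all(
                any((s2, t2) in Q and possible(r(t, a, t2))
                    and leq(r(t, a, t2), r(s, a, s2)) for t2 in states)
                for a in acts for s2 in states if possible(r(s, a, s2)))
            # Clause 2: guaranteed moves of t are matched by guaranteed moves of s.
            ok2 = all(
                any((s2, t2) in Q and guaranteed(r(s, a, s2))
                    and leq(r(t, a, t2), r(s, a, s2)) for s2 in states)
                for a in acts for t2 in states if guaranteed(r(t, a, t2)))
            # Clause 3: labels of t are below labels of s.
            ok3 = all(leq(L[(t, p)], L[(s, p)]) for p in props)
            if ok1 and ok2 and ok3:
                keep.add((s, t))
        return keep

    Q = set(product(states, states))
    while True:
        nxt = step(Q)
        if nxt == Q:
            return Q
        Q = nxt

# Hypothetical example: t0 has a guaranteed and a merely possible a-move,
# s0 realizes only the guaranteed one.
states = ["s0", "s1", "t0", "t1", "t2"]
R = {("s0", "a", "s1"): "tt", ("t0", "a", "t1"): "tt", ("t0", "a", "t2"): "dk"}
Q = greatest_refinement(states, ["a"], R, {}, [])
```

In this example the pair `("s0", "t0")` survives, so `s0` refines `t0`, while `("t0", "s0")` is discarded because the dk-transition of `t0` has no dk-match out of `s0`.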

It is easily seen that these functionals $F_D$ are monotone, so they have a greatest
fixed point, which is also the greatest $D$-refinement. One may readily
show that $D$-refinements are closed under all unions and relational composition.
For $D$ being K, K-refinements are simply Milner’s bisimulations [16] for
all event-based models (= trivial labeling function). To illustrate, consider the
model in Figure 1(d). If we annotate all transitions of Figure 1(a) with $[1, 1]$
and remove the state error, then the resulting system is an I-refinement of the
system in (d). Similarly, if we write $[p, p]$ for each state-transition probability $p$
of the model in Figure 1(c), then this renders another I-refinement of the model 



A Unifying Framework for Model Checking 373 



in (d). Our M-refinements differ from Larsen’s and Thomsen’s refinement notion
for modal transition systems in [14] in that they match an $R(s, a, s') = dk$ with
some $R(t, a, t')$ such that $R(t, a, t') \le tt$, whereas we insist on a monotone match
$R(t, a, t') = dk$, since $R(t, a, t') \le R(s, a, s')$ is enforced. This is a sharper constraint:
a possible, but not guaranteed, transition out of the refining state $s$ has
to be matched with a possible, but not guaranteed, transition out of the refined
state $t$. While our notion is suited for unification with a refinement for interval
transition systems, Larsen’s and Thomsen’s refinement is not only sound,
but also complete [12], for equivalences based on the fragment of the modal
mu-calculus without fixed points, covered in the next section. 

3 Three Semantics of Temporal Logic 

We use the modal mu-calculus [11] as our logic for specifying system properties;
its syntax is given by

$\phi ::= false \mid p \mid Z \mid \neg\phi \mid \phi_1 \wedge \phi_2 \mid \langle a \rangle\phi \mid [a]\phi \mid \mu Z.\phi$

where $p$ ranges over AP, $Z$ over a set of variables, $a \in Act$, and the bodies $\phi$ in
$\mu Z.\phi$ are formally monotone. We write true and $\vee$ for the corresponding derived
operators. Without fixed points and variables, its semantics for modal transition
systems is implicitly given in [12] for a modal interpretation of Hennessy-Milner
logic to obtain a characterization of their notion of refinement. There are several
“natural” semantics for models of type I; a probabilistic interpretation is
sketched in Section 6. A quantitative modal semantics is developed below. We
define all these semantics uniformly and only point out the salient differences.
Given a model $\mathcal{M} = (S, R, L) \rhd D$, the meaning $[\![\phi]\!]^{D}$ is generally a function of
type $Env_D \to S \to D$, where $Env_D$ is the space of all functions (environments $\rho$)
which map variables $Z$ to elements in $D$. For the remainder of this paper, we assume
that all models $\mathcal{M} = (S, R, L) \rhd D$ are image-finite: $\{s' \in S \mid \Diamond R(s, a, s')\}$
is finite for all $s \in S$ and $a \in Act$. 

Semantics for K and M. The interpretation of propositional logic over K is
the usual one. For M, we extend this interpretation by $\neg dk = dk$, $dk \wedge ff = ff$,
$dk \wedge x = dk$ if $x \ne ff$, and $\vee$ is the de Morgan dual of $\wedge$ via $\neg$. Then

$[\![ [a]\phi ]\!]\rho\, s = \bigwedge \{ [\![\phi]\!]\rho\, s' \mid \Diamond R(s, a, s') \}$ and $[\![ \langle a \rangle\phi ]\!]\rho\, s = \bigvee \{ [\![\phi]\!]\rho\, s' \mid \Box R(s, a, s') \}$,

where $\bigwedge$ is the interpretation of the n-ary $\wedge$ and $\bigvee$ that of the n-ary $\vee$ on $D$ = K or M. Note
that this semantics is conservative with respect to refinement, as the universal
modality $[a]$ quantifies over all possible transitions, whereas the existential
modality $\langle a \rangle$ only ranges over guaranteed system moves. Except for $\neg$ on K,
all operations have continuous meaning with respect to the Scott-topology on
$S \to D$ (pointwise ordering). Thus, we may define the meaning of $\mu Z.\phi$ as a
least fixed point (only for K do we require that $\phi$ be formally monotone). The
semantics for K is the usual one for labeled Kripke structures, since $\Box$ and $\Diamond$ agree
on K. The semantics for M, without fixed points, is essentially the one in [12];
note that $\neg\langle a \rangle\neg$ has a different interpretation than $[a]$ on M. We get safety of
model checks with respect to $D$-refinement. 
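
The three-valued connectives and the two modalities on M can be written down directly. The Python sketch below is a minimal illustration under our own encoding (string values "tt", "ff", "dk", a dictionary `R` of transition values with absent entries read as ff, and a dictionary `val` holding the already computed meaning of the subformula at each state); none of these names come from the paper.

```python
def neg(d):
    return {"tt": "ff", "ff": "tt", "dk": "dk"}[d]

def conj(d, e):
    # dk /\ ff = ff, and dk /\ x = dk for x != ff
    if d == "ff" or e == "ff":
        return "ff"
    if d == "dk" or e == "dk":
        return "dk"
    return "tt"

def disj(d, e):
    # de Morgan dual of conj via neg
    return neg(conj(neg(d), neg(e)))

def fold(op, unit, values):
    result = unit
    for v in values:
        result = op(result, v)
    return result

def box_a(R, states, val, s, a):
    """[a]phi at s: n-ary meet over all POSSIBLE a-successors of s."""
    succ = [t for t in states if R.get((s, a, t), "ff") in ("dk", "tt")]
    return fold(conj, "tt", [val[t] for t in succ])

def dia_a(R, states, val, s, a):
    """<a>phi at s: n-ary join over all GUARANTEED a-successors of s."""
    succ = [t for t in states if R.get((s, a, t), "ff") == "tt"]
    return fold(disj, "ff", [val[t] for t in succ])

# Hypothetical example: one guaranteed and one merely possible a-successor.
states = ["s", "t1", "t2"]
R = {("s", "a", "t1"): "tt", ("s", "a", "t2"): "dk"}
val = {"s": "ff", "t1": "tt", "t2": "ff"}
```

Here `box_a(R, states, val, "s", "a")` takes the possible successor `t2` into account, while `dia_a(R, states, val, "s", "a")` only looks at the guaranteed successor `t1`.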

Theorem 1. Let $\mathcal{M} \rhd D$ be a model of type K, or M, with state set $S$ and $s \sqsubseteq_D t$
in $S$. Then $[\![\phi]\!]\rho\, t \le [\![\phi]\!]\rho\, s$ holds for all $\phi$ and $\rho$. 






Thus, model checking an abstraction will give us sound results for the model 
check of any of its refinements, including an actual implementation. 

A modal view of I. Interval transition systems exhibit two, almost orthogonal,
dimensions of non-determinism: first, which element in $\{s' \in S \mid \Diamond R(s, a, s')\}$
will be chosen for execution; second, which $r$ in the interval $R(s, a, s')$ may an
implementation realize for the transition $s \xrightarrow{a} s'$? These notions overlap precisely
when $R(s, a, s')$ equals $[0, y]$ with $y > 0$, for then $s'$ may, or may not, be in the
set of actual $a$-successors of $s$ in the implemented system. This subtlety has to
be reflected in any semantics $[\![\phi]\!]^{I}$. We define a modal semantics $[\![\phi]\!]^{I}\rho\, s$, which is
an interval $[x, y]$, such that $x$ is the greatest lower bound guarantee that $s \models_\rho \phi$
holds, whereas $y$ is the least upper bound possibility thereof; note that $s \models_\rho \phi$ is
a shorthand for $s \in [\![\phi]\!]\rho$, where we turn a system of type I into one of type K,
as explained in Section 4. This intuition determines the semantics uniquely up
to an interpretation of set-theoretic conjunction over I, needed in the set qualifications
for $[a]$ and $\langle a \rangle$. Our semantics therefore depends implicitly on a t-norm
$T\colon [0, 1] \times [0, 1] \to [0, 1]$, a Scott-continuous map (= preserving all directed
suprema) which interprets conjunction and makes $([0, 1], T, 1)$ into a commutative
monoid. The interpretation of $\neg$ is $\neg[x, y] = [1 - y, 1 - x]$. The meaning
of $\neg$ may be justified with the prescriptive intuition given above. We illustrate
such reasoning for the interpretation of $\wedge$: if $[\![\phi]\!]^{I}\rho\, s = [x, y]$ and $[\![\psi]\!]^{I}\rho\, s = [u, v]$,
then $[x, y] \wedge [u, v]$ ought to be $[\min(x, u), \min(y, v)]$. We only justify the choice
of $\min(x, u)$ as the other case is argued similarly: $x$ is a guarantee for $s \models_\rho \phi$
and $u$ is a guarantee for $s \models_\rho \psi$; thus, we have at least the guarantee $\min(x, u)$
in either case. Using the proof rule $\wedge$-introduction, we obtain that $\min(x, u)$ is a
guarantee for $s \models_\rho \phi \wedge \psi$. If $a$ is another such guarantee, we may use the proof
rule $\wedge$-elimination twice to conclude that $a$ is also a guarantee for $s \models_\rho \phi$ and
$s \models_\rho \psi$. But then $a \le x$ and $a \le u$ follow, as $x$ and $u$ are least upper bounds on
the guarantees of these respective properties. 

Now we define the modal quantitative semantics of $\langle a \rangle$ and $[a]$. In the sequel,
we write $pr_1\colon I \to [0, 1]$ for the function $[x, y] \mapsto x$ and $pr_2\colon I \to [0, 1]$ for
$[x, y] \mapsto y$. According to our primary semantic guideline, we set $[\![\langle a \rangle\phi]\!]^{I}\rho\, s = [x, y]$, where

$x = \bigvee_{[0,1]} \{ T(pr_1 R(s, a, s'), pr_1 [\![\phi]\!]^{I}\rho\, s') \mid \Box R(s, a, s') \}$ and
$y = \bigvee_{[0,1]} \{ T(pr_2 R(s, a, s'), pr_2 [\![\phi]\!]^{I}\rho\, s') \mid \Diamond R(s, a, s') \}$.

Since the $pr_2 R(s, a, s')$ are all least upper bound possibilities for $s \xrightarrow{a} s'$, and
since $pr_2 [\![\phi]\!]^{I}\rho\, s'$ is assumed to be the least upper bound possibility for
$s' \models_\rho \phi$ to hold, the least upper bound for the possibility of $s \models_\rho \langle a \rangle\phi$ is a
maximal $T(pr_2 R(s, a, s'), pr_2 [\![\phi]\!]^{I}\rho\, s')$, where the transition to $s'$ is a possible
one. A similar justification for $x$ being the greatest lower bound guarantee for
$s \models_\rho \langle a \rangle\phi$ can be given. Although the qualifications $\Box$ and $\Diamond$ are not really
needed for computing the meaning of $\langle a \rangle$, they reveal a duality between $\langle a \rangle$
and $[a]$. We obtain the meaning of $[a]\phi$ from the one of $\langle a \rangle\phi$ by swapping all
occurrences of $\Box$ and $\Diamond$ (note that $\Box$ and $\Diamond$ are then no longer redundant)
and by replacing all occurrences of $\bigvee_{[0,1]}$ with $\bigwedge_{[0,1]}$. This reflects the fact
that we now have to reason about bounds for all possible next states. If we write
$[u, v]$ for $[\![ [a]\phi ]\!]^{I}\rho\, s$, then $u = \bigwedge_{[0,1]} \{ T(pr_1 R(s, a, s'), pr_1 [\![\phi]\!]^{I}\rho\, s') \mid \Diamond R(s, a, s') \}$ 






and $v = \bigwedge_{[0,1]} \{ T(pr_2 R(s, a, s'), pr_2 [\![\phi]\!]^{I}\rho\, s') \mid \Box R(s, a, s') \}$. The justification of
this semantics is dual to the one of $\langle a \rangle$ in the sense of “duality” explained above. 
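
To make the quantitative modal semantics concrete, the following Python sketch evaluates negation, conjunction, $\langle a \rangle$ and $[a]$ on intervals. It is our reading of the definitions above, not code from the paper: intervals are pairs, `val` holds the meaning of the subformula at each state, only transitions listed in `R` are considered, and the supremum (infimum) of an empty set in $[0, 1]$ is taken to be 0 (1). The t-norm defaults to min; the meaning of $[a]$ follows the stated recipe of swapping the two qualifications and replacing suprema by infima.

```python
def t_min(a, b):
    return min(a, b)

def neg_I(i):
    x, y = i
    return (1 - y, 1 - x)

def conj_I(i, j):
    return (min(i[0], j[0]), min(i[1], j[1]))

def _successors(R, val, states, s, a):
    # Pairs (transition interval, meaning of phi at the target state).
    return [(R[(s, a, t)], val[t]) for t in states if (s, a, t) in R]

def dia_I(R, val, states, s, a, T=t_min):
    succ = _successors(R, val, states, s, a)
    # Guarantee: sup over guaranteed transitions (lower bound > 0).
    x = max((T(r[0], w[0]) for r, w in succ if r[0] > 0), default=0.0)
    # Possibility: sup over possible transitions (upper bound > 0).
    y = max((T(r[1], w[1]) for r, w in succ if r[1] > 0), default=0.0)
    return (x, y)

def box_I(R, val, states, s, a, T=t_min):
    succ = _successors(R, val, states, s, a)
    # Box and diamond swapped, sup replaced by inf.
    u = min((T(r[0], w[0]) for r, w in succ if r[1] > 0), default=1.0)
    v = min((T(r[1], w[1]) for r, w in succ if r[0] > 0), default=1.0)
    return (u, v)

# Hypothetical example with one guaranteed and one merely possible successor.
states = ["s", "t1", "t2"]
R = {("s", "a", "t1"): (0.5, 1.0), ("s", "a", "t2"): (0.0, 0.4)}
val = {"s": (0.0, 1.0), "t1": (1.0, 1.0), "t2": (0.0, 0.0)}
```

With these values, `dia_I(R, val, states, "s", "a")` yields a guarantee of 0.5 coming from the transition to `t1`, whereas `box_I` has guarantee 0 because the merely possible transition to `t2` leads to a state where the subformula fails.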

The justification for the least fixed-point semantics of $\mu Z.\phi$ is that we
begin to unfold the recursive meaning with initial value $[0, 1]$, the bottom of I, at
each state: initially, no guarantees, but all possibilities are given. The process of
unfolding increases evidence for the guarantee of $s \models_\rho \mu Z.\phi$, whereas it decreases
its possibility. If, and when, this process stabilizes, we have established the best
evidence we could find for $s \models_\rho \mu Z.\phi$ without knowing the particular implementation.
Since M and I are not complete lattices, but are domains, we have to, and
can, define the meaning of greatest fixed points $[\![\nu Z.\phi]\!]$ as $[\![\neg\mu Z.\neg\phi[\neg Z/Z]]\!]$,
where $\phi[\neg Z/Z]$ is the result of replacing all free occurrences of $Z$ in $\phi$ with $\neg Z$. 

Theorem 2. For the denotational semantics $[\![\phi]\!]^{I}$, all its operations are Scott-continuous.
In particular, the approximation of fixed points reaches its meaning
at level $\omega$. Moreover, if $\mathcal{M} \rhd I$ has state set $S$ with $s \sqsubseteq_I t$ in $S$, then
$[\![\phi]\!]^{I}\rho\, t \le [\![\phi]\!]^{I}\rho\, s$ holds for all $\phi$ and $\rho$. 

This semantics is also continuous in the sense that the meaning $[\![\phi]\!]^{I}\rho\, s$ will
depend continuously on small changes made to $R$ and $L$ in an underlying model
$\mathcal{M} = (S, R, L) \rhd I$. 



4 Abstractions Across Views. 

With a semantics of temporal logic formulas for each system view at hand, 
we need to understand whether and how such meanings transfer if we change 
the view of a system under consideration. We give such an account for moving 
between K and M, and M and I, respectively. 

Abstraction and safety between K and M. The change of view between
models of type K and M is different in quality from a move between models of type
M and I, for $\neg$ is not monotone on models of type K. Moreover, the embedding
$\iota\colon K \to M$, with $\iota(x) = x$ for all $x \in K$, is not monotone either. It induces
embeddings of models $\mathcal{M} = (S, R, L) \rhd K$ by setting $\iota\mathcal{M} = (S, \iota \circ R, \iota \circ L) \rhd M$.
Conversely, $\alpha^{*}, \alpha^{tr}\colon M \to K$ are both monotone maps $\gamma$ with $\gamma \circ \iota = id_K$ and
$\iota \circ \gamma \ge id_M$; among those maps, $\alpha^{*}$ and $\alpha^{tr}$ are uniquely defined by $\alpha^{*}(dk) = ff$
and $\alpha^{tr}(dk) = tt$, respectively. We set $\alpha^{*}\mathcal{M}' = (S, \alpha^{tr} \circ R', \alpha^{*} \circ L') \rhd K$ for
any $\mathcal{M}' = (S, R', L') \rhd M$. The map $\alpha^{*}$ translates “truth values” pertaining to
“propositional” information (model checks and labeling functions) and behaves
well with respect to propositional logic: $\alpha^{*} \circ \neg \le \neg \circ \alpha^{*}$, $\alpha^{*} \circ \wedge = \wedge \circ (\alpha^{*} \times \alpha^{*})$,
and $\alpha^{*} \circ \vee = \vee \circ (\alpha^{*} \times \alpha^{*})$. Since $\alpha^{tr}$ gives us a conservative account of transitions,
and since $\alpha^{*}$ preserves all suprema as a lower adjoint of $\alpha$, which differs from $\iota$
in that it sends ff to dk, we may relate model checks on $\mathcal{M}' \rhd M$ to those on
$\alpha^{*}\mathcal{M}' \rhd K$, and, similarly, we may compare such results computed for $\mathcal{M} \rhd K$ and
$\iota\mathcal{M} \rhd M$. 

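Theorem 3. For all formulas $\phi$ of the modal mu-calculus, we have the inequality
$\alpha^{*}[\![\phi]\!]^{\mathcal{M}' \rhd M}\rho' \le [\![\phi]\!]^{\alpha^{*}\mathcal{M}' \rhd K}(\alpha^{*} \circ \rho')$ and the equality
$\iota[\![\phi]\!]^{\mathcal{M} \rhd K}\rho = [\![\phi]\!]^{\iota\mathcal{M} \rhd M}(\iota \circ \rho)$ for
all models and environments of the required types. 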







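For $D$ = K or M, we write $s \models^{\mathcal{M}}_{\rho} \phi$ if $[\![\phi]\!]^{\mathcal{M} \rhd D}\rho\, s = tt$, and $s \not\models^{\mathcal{M}}_{\rho} \phi$ if
$[\![\phi]\!]^{\mathcal{M} \rhd D}\rho\, s = ff$, where $\mathcal{M}$ is a model of type $D$ and $\rho \in Env_D$. Given any
$\mathcal{M}' = (S', R', L') \rhd M$ with $s \models^{\mathcal{M}'}_{\rho} \phi$, we may use the previous theorem to infer
that $tt = \alpha^{*}[\![\phi]\!]^{\mathcal{M}' \rhd M}\rho\, s \le [\![\phi]\!]^{\alpha^{*}\mathcal{M}' \rhd K}(\alpha^{*} \circ \rho)\, s$; but since tt is a maximal element
in K, this implies $[\![\phi]\!]^{\alpha^{*}\mathcal{M}' \rhd K}(\alpha^{*} \circ \rho)\, s = tt$. Thus, $s \models^{\mathcal{M}'}_{\rho} \phi$ implies
$s \models^{\alpha^{*}\mathcal{M}'}_{\alpha^{*} \circ \rho} \phi$ for
all formulas of the modal mu-calculus, where the latter is the standard notion
of satisfaction for labeled Kripke structures. However, such an inference cannot
be made for the negative version (at the meta-level): if $s \not\models^{\mathcal{M}'}_{\rho} \phi$, then both
parts of our theorem provide no additional information in general. In this case,
the inequality in this theorem is redundant as the left hand side,
$\alpha^{*}[\![\phi]\!]^{\mathcal{M}' \rhd M}\rho'\, s$,
denotes the least element of K; the theorem’s equality is of no use either, since
$\iota \circ \alpha^{*}$ is not equal to, but above, the identity $id_M$. These results are quite similar
in structure to the ones obtained in [5] and it would be of interest to establish
connections to the work in loc. cit. 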

Abstraction and safety between M and I. We embed M into I via $\beta$ such that
$\beta(dk) = [0, 1]$, $\beta(ff) = [0, 0]$ and $\beta(tt) = [1, 1]$. Note that this map is monotone
and matches our semantic intuition of $[x, y]$ giving guarantees and possibilities
of truth. We re-translate such truth-value intervals with the upper adjoint $\beta^{*}$,
which is uniquely determined by being monotone and satisfying $\beta^{*} \circ \beta = id_M$ and
$\beta \circ \beta^{*} \le id_I$. As for the values of transitions, we define a map $\beta^{tr}\colon I \to M$ which
is uniquely determined by preserving the three predicates $\Box$, $\Diamond \wedge \neg\Box$, and $\neg\Diamond$,
which single out all elements of M. Note that this map also reflects $\Box$ and $\Diamond$ from
M back to I. One can readily see that $\beta^{*}$ is a homomorphism for $\neg$, $\wedge$, and $\vee$;
e.g. $\beta^{*} \circ \neg = \neg \circ \beta^{*}$. As for the modalities and fixed points, we make crucial use
of the fact that $\beta^{*}$ is the upper adjoint of $\beta$. For $\mathcal{M} \rhd I$ and $\mathcal{M}' \rhd M$, we define
$\beta^{*}\mathcal{M} = (S, \beta^{tr} \circ R, \beta^{*} \circ L) \rhd M$ and $\beta\mathcal{M}' = (S', \beta \circ R', \beta \circ L') \rhd I$. 
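
The translations between M and I are small enough to check by hand or by machine. The Python sketch below is an illustration only: the names `beta`, `beta_star` and `beta_tr` are our labels for the embedding, its upper adjoint on truth values, and the translation of transition values, and the sample intervals are arbitrary. It confirms the two adjunction conditions on those samples.

```python
def leq_I(i, j):
    """[u, v] <= [x, y] iff u <= x <= y <= v (reverse containment)."""
    (u, v), (x, y) = i, j
    return u <= x <= y <= v

def beta(d):
    """Embedding of M into I."""
    return {"dk": (0.0, 1.0), "ff": (0.0, 0.0), "tt": (1.0, 1.0)}[d]

def beta_star(i):
    """Truth values: only the two extreme point intervals carry a verdict."""
    if i == (1.0, 1.0):
        return "tt"
    if i == (0.0, 0.0):
        return "ff"
    return "dk"

def beta_tr(i):
    """Transition values: preserve box, (diamond and not box), not diamond."""
    x, y = i
    if x > 0:
        return "tt"    # guaranteed
    if y > 0:
        return "dk"    # possible but not guaranteed
    return "ff"        # impossible

samples = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.3, 0.7), (0.0, 0.5), (0.5, 1.0)]
retract_ok = all(beta_star(beta(d)) == d for d in ("dk", "ff", "tt"))
deflation_ok = all(leq_I(beta(beta_star(i)), i) for i in samples)
```

Note that `beta_star` and `beta_tr` differ on an interval such as `(0.3, 0.7)`: as a truth value it is undetermined, but as a transition value it denotes a guaranteed transition.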

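Theorem 4. Let $\phi$ be any formula of the modal mu-calculus and consider models
$\mathcal{M} \rhd I$ and $\mathcal{M}' \rhd M$. Then $\beta^{*}[\![\phi]\!]^{\mathcal{M} \rhd I}\rho \le [\![\phi]\!]^{\beta^{*}\mathcal{M} \rhd M}(\beta^{*} \circ \rho)$ and
$\beta[\![\phi]\!]^{\mathcal{M}' \rhd M}\rho' \le [\![\phi]\!]^{\beta\mathcal{M}' \rhd I}(\beta \circ \rho')$ hold for all t-norms $T$ such that
$T(a, b) = 0$ implies $a = 0$ or $b = 0$. 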

The proof of this theorem reveals that the condition on the t-norm is necessary;
$LAND(a, b) = \max(a + b - 1, 0)$ is an example of a Scott-continuous
t-norm that does not satisfy it: take $a$ and $b$ to be 0.5. We also require that
$T(a, b) = 1$ imply $a = b = 1$, but this holds for all t-norms, since min is known
to be the greatest t-norm in the pointwise ordering: $T(a, b) \le \min(a, b)$ holds for
all $a, b \in [0, 1]$ and all t-norms $T$. The first inequality in the theorem above states
that if a model check of type I results in a “truth value” $[0, 0]$ or $[1, 1]$, then that
value is also the result of the same model check on the more concrete system of
type M. One may now combine Theorems 3 and 4 to link model-checking results
between types K and I. 
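
The side condition on the t-norm can be tested mechanically on a finite grid. The following Python sketch is only a sanity check, not a proof: the helper names and the grid of tenths are our choices, and exact rationals are used to avoid floating-point noise. It compares min, the product, and the LAND t-norm mentioned above.

```python
from fractions import Fraction

def t_min(a, b):
    return min(a, b)

def t_prod(a, b):
    return a * b

def t_land(a, b):
    # LAND(a, b) = max(a + b - 1, 0)
    return max(a + b - 1, 0)

def no_zero_divisors(T, grid):
    """T(a, b) = 0 implies a = 0 or b = 0, checked on the grid only."""
    return all(T(a, b) != 0 or a == 0 or b == 0 for a in grid for b in grid)

def below_min(T, grid):
    """T(a, b) <= min(a, b), checked on the grid only."""
    return all(T(a, b) <= min(a, b) for a in grid for b in grid)

grid = [Fraction(i, 10) for i in range(11)]
half = Fraction(1, 2)
```

On this grid, `no_zero_divisors` holds for `t_min` and `t_prod` but fails for `t_land`, with `t_land(half, half) == 0` as the witness named in the text; all three are dominated by min.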






5 Three Views of Description Languages 

We choose process algebras as system description languages which are parametric
in the domain of view $D$, and whose structural operational semantics can be seen
as an abstract interpretation, based on $D$, of the concrete operational semantics
for K. For the sake of brevity, we only consider a fragment of Milner’s CCS [16],
given by the syntax

$p ::= nil \mid a_d.p \mid p + p \mid p \| p \mid p \backslash B \mid x \mid fix\, x.p$,

where $d \in D$, $a \in Act$, $x$ ranges over a set of process variables, and $B \subseteq Act$. Note
that the only non-standard feature is the annotation of the standard prefix, $a.p$,
with a domain element. We also assume the usual involution $a \mapsto \bar{a}\colon Act \to Act$
for communication with the self-involutive symbol $\tau \in Act$ for internal,
non-observable actions. In Figure 2, the abstract interpretations $+^{D}$ and $Par^{D}$
may well depend on the semantic interpretation one has in mind; e.g. whether
one considers a modal or probabilistic semantics for $D$ = I. To wit, we define
$Par^{D}\{(d_1^{a}, d_2^{\bar{a}}) \mid a \in Act\} = \bigvee \{ d_1^{a} \wedge d_2^{\bar{a}} \mid a \in Act \}$ for $D$ = K or M. For $D$ = I,
we choose a modal interpretation, setting $pr_2 Par^{I}\{(d_1^{a}, d_2^{\bar{a}}) \mid a \in Act\} = 0$ if
there is no $a \in Act$ with $\Diamond d_1^{a}$ and $\Diamond d_2^{\bar{a}}$; otherwise, we define

$pr_2 Par^{I}\{(d_1^{a}, d_2^{\bar{a}}) \mid a \in Act\} = \max\{\min(pr_2 d_1^{a}, pr_2 d_2^{\bar{a}}) \mid \Diamond d_1^{a} \text{ and } \Diamond d_2^{\bar{a}}\}$ and
$pr_1 Par^{I}\{(d_1^{a}, d_2^{\bar{a}}) \mid a \in Act\} = \min\{\min(pr_1 d_1^{a}, pr_1 d_2^{\bar{a}}) \mid \Box d_1^{a} \text{ and } \Box d_2^{\bar{a}}\}$.

Thus, this semantics computes
the worst-case, respectively, best-case evidence for observing an internal $\tau$-move.
The modalities are placed as in the semantics of $[a]$ and we could have used any
Scott-continuous t-norm instead of the binary min operator, as long as $\Box$ and
$\Diamond$ distribute over it. The rule for recursion indicates that these interpretations
have to be continuous. Note that each process term, $p$, determines a model of
type $D$: if there is a judgment $\vdash R(p, a, p') = d$, then $d$ is the value of $R(p, a, p')$;
otherwise, we set it to be $[\![false]\!]\rho\, s$. Note that $a_{ff}.p$ is bisimilar to nil for $D$
being K. 

Theorem 5. Let $Par^{D}$ be defined as above. For $D$ = K, the “abstract interpretation”
in Figure 2 matches the structural operational semantics of the corresponding
fragment of CCS in [16]. For $D$ = M, the abstract interpretation in the
same figure matches the semantics of the modal process logic for the corresponding
fragment in [14], where we identify $a_{\Diamond}.p$ and $a_{\Box}.p$ from [14] with $a_{dk}.p$ and
$a_{tt}.p$, respectively. 

We are not aware of process algebras based on intervals in the literature, so we
cannot compare our abstract interpretation for $D$ = I. Since each process term $p$
of type $D$ determines a model of type $D$, we can write $p \sqsubseteq_D q$ if $p$ $D$-refines $q$
in the system formed by the sum of $p$ and $q$. We can prove that refinements are
compositional for some of the process algebra operators. 

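Theorem 6. For all $a \in Act$, $d \in D$, and closed process algebra terms $p \sqsubseteq_D q$,
$p_i \sqsubseteq_D q_i$ ($i = 1, 2$) we have $a_d.p \sqsubseteq_D a_d.q$, $p_1 \| p_2 \sqsubseteq_D q_1 \| q_2$, and $p \backslash B \sqsubseteq_D q \backslash B$. 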

This result can be extended to recursion with a machinery very similar to 
the one employed for D = K in [16]. 






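$\vdash R(a_d.p, a, p) = d$

$\dfrac{\vdash R(p_1, a, p') = d_1 \text{ and } \vdash R(p_2, a, p') = d_2}{\vdash R(p_1 + p_2, a, p') = d_1 +^{D} d_2}$

$\dfrac{\vdash R(p_1, a, p_1') = d_1}{\vdash R(p_1 \| p_2, a, p_1' \| p_2) = d_1}$ (Com_1)

$\dfrac{\vdash R(p_2, a, p_2') = d_2}{\vdash R(p_1 \| p_2, a, p_1 \| p_2') = d_2}$ (Com_2)

$\dfrac{\vdash R(p, a, p') = d_a \text{ and } a \notin B}{\vdash R(p \backslash B, a, p' \backslash B) = d_a}$ (Res)

$\dfrac{\vdash R(p_1, a, p_1') = d_1^{a} \text{ and } \vdash R(p_2, \bar{a}, p_2') = d_2^{\bar{a}}}{\vdash R(p_1 \| p_2, \tau, p_1' \| p_2') = Par^{D}\{(d_1^{a}, d_2^{\bar{a}}) \mid a \in Act\}}$ (Com_3)

$\dfrac{\vdash R(p[fix\, x.p/x], a, p') = d}{\vdash R(fix\, x.p, a, p') = d}$

Fig. 2. Abstract interpretation of a structural operational semantics for our three
process algebras 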



6 Loose Markov Chains 

Knowing the class of implementations may well allow the customization of our
framework to such a class. For example, if interval transition systems are to specify
labeled Markov chains, then we can restrict our attention to certain models
of type I. A labeled Markov chain $(S, P, L)$ satisfies $\sum_{s' \in S} P(s, a, s') = 1$ for all
$a \in Act$ and $s \in S$. We may approximate such a model with the same set of
states by $\mathcal{M} = (S, R, L) \rhd I$ such that, for all $a \in Act$ and $s, s' \in S$, we have

$\sum_{s'' \in S} pr_1 R(s, a, s'') \le 1$ and $pr_2 R(s, a, s') \le 1 - \sum_{s'' \ne s'} pr_1 R(s, a, s'')$.

The first inequality says that the lower bounds for the actual state-transition probabilities
form a subprobability distribution; the sum of probabilistic guarantees must not
exceed 1. The second inequality is a consistency condition, saying that the upper
bound on the possible probability of $s \xrightarrow{a} s'$ cannot be greater than 1 minus
the sum of all lower bound guarantees on probabilities of moves to any other
successor state of $s$. We call such models loose Markov chains. The models in
Figure 1(c) and (d) are such examples and (c) is an I-refinement of (d). It would
be of interest to define a probabilistic refinement which coincides with probabilistic
bisimulation [13] for maximal models (Markov chains) and to compare
such a notion with the work of [10]. As for a semantics of formulas $\phi$, we change
the modal semantics by re-interpreting $\wedge$, $\langle a \rangle$, and $[a]$. For $\wedge$, we may either use
a safe t-norm, as done in [18,9,2], or develop a measure theory of measures of
type $\mu'\colon \Sigma(X) \to I$, where the conventional measures of type $\mu\colon \Sigma(X) \to [0, 1]$
form the maximal elements of that space. As for the modalities, we identify the
meaning of $\langle a \rangle$ and $[a]$ and set

$pr_1 [\![\langle a \rangle\phi]\!]\rho\, s = \sum_{s'} pr_1 R(s, a, s') \cdot pr_1 [\![\phi]\!]\rho\, s'$ and
$pr_2 [\![\langle a \rangle\phi]\!]\rho\, s = \min(1, \sum_{s'} pr_2 R(s, a, s') \cdot pr_2 [\![\phi]\!]\rho\, s')$;

note that $\cdot$ is a Scott-continuous t-norm. 
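
The two conditions that define a loose Markov chain are simple to check for a finite model. The Python sketch below is our illustration, with a hypothetical function name and dictionary encoding; it assumes that a transition absent from `R` has value $[0, 0]$ and uses exact rationals in the example so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction as F

def is_loose_markov_chain(states, acts, R):
    """Check both inequalities for every state s and action a."""
    zero = (F(0), F(0))                      # assumption: absent means [0, 0]
    for s in states:
        for a in acts:
            low = {t: R.get((s, a, t), zero)[0] for t in states}
            total_low = sum(low.values())
            # Lower bounds must form a subprobability distribution.
            if total_low > 1:
                return False
            for t in states:
                high = R.get((s, a, t), zero)[1]
                # Upper bound to t <= 1 - (lower bounds to all other states).
                if high > 1 - (total_low - low[t]):
                    return False
    return True

states, acts = ["s", "t"], ["a"]
good = {("s", "a", "s"): (F(1, 4), F(1, 2)),
        ("s", "a", "t"): (F(1, 2), F(3, 4)),
        ("t", "a", "t"): (F(1), F(1))}
bad = {("s", "a", "s"): (F(1, 2), F(1)),
       ("s", "a", "t"): (F(1, 2), F(1)),
       ("t", "a", "t"): (F(1), F(1))}
```

The model `good` passes both checks, whereas `bad` violates the consistency condition: an upper bound of 1 on the move from `s` to `s` is incompatible with a guaranteed probability of 1/2 for the move from `s` to `t`.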






Theorem 7. Let $\mathcal{M} \rhd I$ be a loose Markov chain with state set $S$ and $s \sqsubseteq_I t$
in $S$. Then $[\![\phi]\!]\rho\, t \le [\![\phi]\!]\rho\, s$ holds for all $\phi$ and $\rho$ and all monotone interpretations
of $\wedge$; in particular, this holds when $\wedge$ is interpreted as a probabilistically
conservative t-norm. 

7 Outlook 

The design and analysis of algorithms for deciding D-refinements needs to be 
done particularly for the case D = I . It would be of interest to obtain an inde- 
pendent logical characterization of these refinements. Interval transition systems 
should be evaluated for their suitability for describing systems with uncertainty
or vagueness. A guarded-command language for the description of such 
models may provide a foundation for the formal analysis of fuzzy interval infer- 
ence systems. Connections to Bayesian networks and Dempster-Shafer theories 
of evidence need to be explored. The models of loose Markov chains require 
a customized description language; their ergodic analysis should reduce to an 
optimization problem. The computation of conditional probabilities, however, 
may require a “domain theory” for probability measures, where the latter are 
maximal elements in a space of “measures” of range I. This needs to be a con- 
servative extension in the sense that the “probability axioms” for the I-valued 
measures will reduce to the familiar axioms in case that the measure is maximal. 
Such work may well transfer to the generalized probabilistic logic (GPL) designed 
by M. Narasimha, R. Cleaveland and P. Iyer in [19]. Loose Markov chains will 
also benefit from a probabilistic version of I-refinement which should coincide 
with a familiar probabilistic bisimulation for “maximal” models. Martin Escardo 
pointed out that our framework is extendible to cover infinite-state systems as 
well. One obtains a continuous, and computable, semantics of model checks, provided
that the sets of possible ($\Diamond$) and guaranteed ($\Box$) $a$-successors of state $s$
are compact for all $a \in Act$ and $s \in S$, where $S$ is a compact Hausdorff space. 



Acknowledgments 

A number of people have made valuable suggestions during visits, or talks, 
given at their institutes. Among them were Martin Escardo, Stephen Gilmore, 
Jane Hillston, Marta Kwiatkowska, Annabelle McIver, Carroll Morgan, and Jeff 
Sanders. 



References 

1. C. Baier. Polynomial Time Algorithms for Testing Probabilistic Bisimulation and
Simulation. In Proceedings of CAV’96, number 1102 in Lecture Notes in Computer
Science, pages 38-49. Springer Verlag, 1996. 

2. C. Baier, M. Kwiatkowska, and G. Norman. Computing probability bounds for
linear time formulas over concurrent probabilistic systems. Electronic Notes in
Theoretical Computer Science, 21:19 pages, 1999. 






3. E. M. Clarke and E. A. Emerson. Synthesis of synchronization skeletons for branching
time temporal logic. In D. Kozen, editor, Proc. Logic of Programs, volume 131
of LNCS. Springer Verlag, 1981. 

4. E. M. Clarke, O. Grumberg, and D. E. Long. Model Checking and Abstraction. In
19th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pages 343-354. ACM Press, 1992. 

5. Dennis Dams, Rob Gerth, and Orna Grumberg. Abstract interpretation of reactive
systems. ACM Transactions on Programming Languages and Systems, 19(2), 1997. 

6. R. de Nicola and F. Vaandrager. Three Logics for Branching Bisimulation. Journal
of the Association of Computing Machinery, 42(2):458-487, March 1995. 

7. P. R. Halmos. Measure Theory. D. Van Nostrand Company, 1950. 

8. J. Hillston. A Compositional Approach to Performance Modelling. Cambridge
University Press. Distinguished Dissertation Series, 1996. 

9. M. Huth. The Interval Domain: A Matchmaker for aCTL and aPCTL. In M. Mislove,
editor, 2nd US-Brazil joint workshop on the Formal Foundations of Software
Systems held at Tulane University, New Orleans, Louisiana, November 13-16, 1997,
volume 14 of Electronic Notes in Theoretical Computer Science. Elsevier, 1999. 

10. B. Jonsson and K. G. Larsen. Specification and Refinement of Probabilistic Processes.
In Proceedings of the International Symposium on Logic in Computer Science,
pages 266-277. IEEE Computer Society Press, July 1991. 

11. D. Kozen. Results on the propositional mu-calculus. Theoretical Computer Science,
27:333-354, 1983. 

12. K. G. Larsen. Modal Specifications. In J. Sifakis, editor, Automatic Verification
Methods for Finite State Systems, number 407 in Lecture Notes in Computer
Science, pages 232-246. Springer Verlag, 1989. International
Workshop, Grenoble, France, June 12-14, 1989. 

13. K. G. Larsen and A. Skou. Bisimulation through Probabilistic Testing. Information
and Computation, 94(1):1-28, September 1991. 

14. K. G. Larsen and B. Thomsen. A Modal Process Logic. In Third Annual Symposium
on Logic in Computer Science, pages 203-210. IEEE Computer Society Press,
1988. 

15. K. L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993. 

16. R. Milner. Communication and Concurrency. Series in Computer Science. Prentice-Hall
International, 1989. 

17. R. E. Moore. Interval Analysis. Prentice-Hall, Englewood Cliffs, 1966. 

18. C. Morgan, A. McIver, and K. Seidel. Probabilistic predicate transformers. ACM
Transactions on Programming Languages and Systems, 18(3):325-353, May 1996. 

19. M. Narasimha, R. Cleaveland, and P. Iyer. Probabilistic Temporal Logics via the
Modal Mu-Calculus. In W. Thomas, editor, Foundations of Software Science and
Computation Structures, volume 1578 of Lecture Notes in Computer Science, pages
288-305. Springer Verlag, March 1999. 

20. J. P. Queille and J. Sifakis. Specification and verification of concurrent systems in
CESAR. In Proceedings of the fifth International Symposium on Programming, 1981. 

21. D. S. Scott. Lattice Theory, Data Types and Semantics. In Formal Semantics of
Programming Languages, pages 66-106. Prentice-Hall, 1972. 



Graded Modalities and Resource Bisimulation 



Flavio Corradini$^1$, Rocco De Nicola$^2$, and Anna Labella$^3$ 

$^1$ Dipartimento di Matematica Pura ed Applicata, Università dell’Aquila 
flavio@univaq.it 

$^2$ Dipartimento di Sistemi e Informatica, Università di Firenze 
denicola@dsi.unifi.it 

$^3$ Dipartimento di Scienze dell’Informazione, Università di Roma “La Sapienza” 
labella@dsi.uniroma1.it 



Abstract. The logical characterizations of the strong and the weak (ignoring
silent actions) versions of resource bisimulation are studied. The 
temporal logics we introduce are variants of Hennessy-Milner Logics that 
use graded modalities instead of the classical box and diamond operators. 

The considered strong bisimulation induces an equivalence that, when 
applied to labelled transition systems, permits identifying all and only 
those systems that give rise to isomorphic unfoldings. Strong resource 
bisimulation has been used to provide nondeterministic interpretation 
of finite regular expressions and new axiomatizations for them. Here we 
generalize this result to its weak variant. 

1 Introduction 

Modal and temporal logics have proved to be useful formalisms for specifying 
and verifying properties of concurrent systems (see e.g. [19]), and different tools 
have been developed to support such activities [8,7]. However, to date, there is 
no general agreement on the type of logic to be used. Since a logic naturally 
gives rise to equivalences (two systems are equivalent if they satisfy the same 
formulae) often, for a better understanding and evaluation, the proposed logics 
have been contrasted with behavioural equivalences. The interested reader is 
referred to [10,17] for comparative presentations of many such equivalences. 

Establishing a direct correspondence between a logic and a behavioural equiv- 
alence provides additional confidence in both approaches. A well-known result 
relating behavioural and logical semantics is that reported in [15]; there, a modal 
logic, now known as Hennessy-Milner Logic HML, is defined which, when inter- 
preted over (arc-) labelled transition systems with and without silent actions, 
is proved to be in full agreement with two equivalences called strong and weak 
observational equivalence. Other correspondences have been established in [2] 
where two equivalences over Kripke structures (node-labelled transition systems) 
are related to two variants of CTL* [13], and in [12] where three different logical 
characterizations are provided for another variant of bisimulation called branch- 
ing bisimulation. 

In this paper, we study the logical characterization of yet another variant 
of bisimulation that we call resource bisimulation [6]. This bisimulation takes 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 381—393, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






into account the number of choices a system has, even after it has decided the 
specific action to be performed. The new equivalence counts the instances of 
specific actions a system may perform and thus considers as different the two 
terms P and P + P; the latter representing the non-deterministic composition 
of a system with itself. Intuitively, this can be motivated by saying that P + P 
duplicates the resources available in P. This permits differentiating systems also 
relatively to a form of fault tolerance known as “cold redundancy” : P + P is 
more tolerant to faults than P, because it can take advantage of the different 
instances of the available resources. 

Resource bisimulation enjoys several nice properties (we refer to [6] for a 
comprehensive account and for additional motivations). It has been shown that 
resource bisimulation coincides with the kernel of resource simulation (this re- 
sult is new for simulation-like semantics) . Moreover, it permits identifying all and 
only those labelled transition systems that give rise to isomorphic unfoldings. 
Also, resource bisimulation, when used to provide nondeterministic interpreta- 
tion of finite regular expressions, leads to a behavioural semantics that is in full 
agreement with a tree-based denotational semantics and is characterized via a 
small set of axioms obtained from Salomaa's axiomatization of regular expressions [21]
by removing the axioms stating idempotence of + and distributivity
of · over +; see Table 1.



Table 1. Axioms for resource bisimulation over finite regular expressions.

$X + Y = Y + X$   (C1)
$(X + Y) + Z = X + (Y + Z)$   (C2)
$X + 0 = X$   (C3)
$(X \cdot Y) \cdot Z = X \cdot (Y \cdot Z)$   (S1)
$X \cdot 1 = X$   (S2)
$1 \cdot X = X$   (S3)
$X \cdot 0 = 0$   (S4)
$0 \cdot X = 0$   (S5)
$(X + Y) \cdot Z = (X \cdot Z) + (Y \cdot Z)$   (RD)
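As a small illustration of how these axioms are used, the following worked derivation (a sketch added for illustration, relying only on the axioms labelled (RD) and (S3) in Table 1) rewrites (1 + 1)·a into a + a. Since idempotence of + is not among the axioms, a + a cannot be reduced further to a.

```latex
\begin{align*}
(1 + 1)\cdot a &= (1\cdot a) + (1\cdot a) && \text{(RD) with } X = Y = 1,\ Z = a\\
               &= a + a                   && \text{(S3), applied to each summand}
\end{align*}
```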



In this paper we continue our investigation of resource bisimulation in two
directions. First, we study a logical characterization of resource bisimulation;
then, we provide a sound and complete axiomatization for its weak variant.
The logic which characterizes resource bisimulation is obtained by replacing
both the box and the diamond modalities of HML with the family of graded
modalities [14], defined below, where # denotes multiset cardinality.

$P \models \langle\mu\rangle_n \varphi$ if and only if $\#\{|\, P' \mid P \xrightarrow{\mu} P' \text{ and } P' \models \varphi \,|\} = n$.






If we define Graded HML (GHML) to be the set of formulae generated by the
grammar

$\varphi ::= \mathit{True} \mid \varphi_1 \wedge \varphi_2 \mid \langle\mu\rangle_n \varphi$ where $\mu \in A$ and $0 < n < \infty$,

it can be established that

$(\forall \varphi \in \mathrm{GHML},\ P \models \varphi \iff Q \models \varphi)$ if and only if $P \sim_r Q$.

We shall also study the weak variant of resource bisimulation over regular
expressions enriched with a distinct invisible $\tau$-action, and we shall provide,
for this new equivalence as well, both an axiomatic and a logical characterization. The
complete axiomatization will be obtained by adding the axiom

$\alpha \cdot \tau \cdot X = \alpha \cdot X$

to those for (strong) resource bisimulation of Table 1. The logical
characterization is obtained by providing a different (weak) interpretation of the modal
operators described above.

Due to space limitations, all proofs are omitted; they are reported in the full
version of the paper.



2 Nondeterministic Expressions and Resource 
Bisimulation 

In this section we provide an observational account of finite nondeterministic reg- 
ular expressions, by interpreting them as equivalence classes of labelled transition 
systems. This part has been extensively treated in [6]. The proposed equivalence 
relies on the same recursive pattern of bisimulation but takes into account also 
the number of equivalent states that are reachable from a given one. 

Let $A \cup \{1\}$ be a set of actions. The set of nondeterministic finite regular
expressions over $A$ is the set PL of terms generated by the following grammar:

$P ::= 0 \mid 1 \mid a \mid P + P \mid P \cdot P$ where $a$ is in $A$.

We give the following interpretation to nondeterministic regular expressions. As
in [1], the term 0 denotes the empty process. The term 1 denotes the process
that does nothing and successfully terminates. The term $a$ denotes a process that
executes a visible action $a$ and then successfully terminates. The operator + can
be seen as describing the nondeterministic composition of agents. The operator ·
models sequential composition.

Definition 1. A labelled transition system is a triple $\langle Z, L, T \rangle$ where $Z$ is a
set of states, $L$ is a set of labels and $T = \{ \xrightarrow{\mu}\ \subseteq Z \times Z \mid \mu \in L \}$ is a transition
relation.






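Table 2. Active predicate.

$\mathit{active}(1)$

$\mathit{active}(a)$

$\mathit{active}(P) \vee \mathit{active}(Q) \implies \mathit{active}(P + Q)$

$\mathit{active}(P) \wedge \mathit{active}(Q) \implies \mathit{active}(P \cdot Q)$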



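Table 3. Operational Semantics for PL.

Tic)   $1 \xrightarrow{\langle 1,\varepsilon\rangle} 1$

Atom)  $a \xrightarrow{\langle a,\varepsilon\rangle} 1$

Sum1)  $\dfrac{P \xrightarrow{\langle\mu,u\rangle} P',\ \mathit{active}(Q)}{P + Q \xrightarrow{\langle\mu,lu\rangle} P'}$

Sum1') $\dfrac{P \xrightarrow{\langle\mu,u\rangle} P',\ \neg\mathit{active}(Q)}{P + Q \xrightarrow{\langle\mu,u\rangle} P'}$

Sum2)  $\dfrac{Q \xrightarrow{\langle\mu,u\rangle} Q',\ \mathit{active}(P)}{P + Q \xrightarrow{\langle\mu,ru\rangle} Q'}$

Sum2') $\dfrac{Q \xrightarrow{\langle\mu,u\rangle} Q',\ \neg\mathit{active}(P)}{P + Q \xrightarrow{\langle\mu,u\rangle} Q'}$

Seq1)  $\dfrac{P \xrightarrow{\langle\mu,u\rangle} P',\ \mathit{active}(Q)}{P \cdot Q \xrightarrow{\langle\mu,u\rangle} P' \cdot Q}$

Seq2)  $\dfrac{P \xrightarrow{\langle 1,u\rangle} 1,\ Q \xrightarrow{\langle\mu,v\rangle} Q'}{P \cdot Q \xrightarrow{\langle\mu,uv\rangle} Q'}$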



In our case, states are terms of PL and labels are pairs $\langle\mu,u\rangle$ with
$\mu \in A \cup \{1\}$ and $u$ a word, called choice sequence, in the free monoid generated
by $\{l, r\}$. The transition relation relies on the "active" predicate defined in
Table 2 and is defined in Table 3. There, and in the rest of the paper, we write
$z \xrightarrow{\mu} z'$ instead of $\langle z, z'\rangle \in\ \xrightarrow{\mu}$.

We have two kinds of transitions:

• $P \xrightarrow{\langle a,u\rangle} P'$: $P$ performs an action $a$, possibly preceded by 1-actions, with
choice sequence $u$.

• $P \xrightarrow{\langle 1,u\rangle} 1$: $P$ performs 1-actions to reach process 1 with choice sequence $u$.

These transitions are atomic, which means that they cannot be interrupted 
and keep no track of intermediate states. In both cases, u is used to keep in- 
formation about the possible nondeterministic structure of P, and will permit 
distinguishing those transitions of P whose action label and target state have the 



same name but are the result of different choices. Thus for $a + a$, it is possible to
record that it can perform two different $a$ actions: $a + a \xrightarrow{\langle a,l\rangle} 1$ and $a + a \xrightarrow{\langle a,r\rangle} 1$;
without the $l$ and $r$ labels, we would have only the $a + a \xrightarrow{a} 1$ transition.

The predicate active over PL processes that is used in Seq1 allows us to
detect empty processes and to avoid performing actions leading to deadlocked
states.

The rules of Table 3 should be self-explanatory. We only comment on those
for + and ·.

The rule for $P + Q$ says that if $P$ can perform $\langle\mu,u\rangle$ to become $P'$, and $Q$ is
not deadlocked, then $P + Q$ can perform $\langle\mu,lu\rangle$ to become $P'$, where $l$ records
that action $\mu$ has been performed by the left alternative. If $Q$ is deadlocked,
then no track of the choice is kept in the label. The right alternative is dealt
with symmetrically.

Seq1) mimics sequential composition of $P$ and $Q$; it states that if $P$ can
perform $\langle\mu,u\rangle$ to become $P'$, then $P \cdot Q$ can evolve with the same label to $P' \cdot Q$. The premise
$\mathit{active}(Q)$ of the inference rule ensures that $Q$ can successfully terminate.

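In order to abstract from choice sequences while keeping information about
the alternatives a process has for performing a specific action, we introduce a
new transition relation that associates to every pair $\langle P \in PL,\ \mu \in Act \cup \{1\} \rangle$
a multiset $M$, representing all processes that are targets of different $\langle\mu,u\rangle$-transitions
from $P$. The new transition relation is defined as the relation that
satisfies:

$P \xrightarrow{\mu} \{|\, P' \mid \exists u.\ P \xrightarrow{\langle\mu,u\rangle} P' \,|\}$

Thus, for example, we have:

- $a + a \xrightarrow{a} \{|\, 1, 1 \,|\}$ because $a + a \xrightarrow{\langle a,l\rangle} 1$ and $a + a \xrightarrow{\langle a,r\rangle} 1$;

- $(1 + 1)\cdot(a + a) \xrightarrow{a} \{|\, 1, 1, 1, 1 \,|\}$ because
$(1 + 1)\cdot(a + a) \xrightarrow{\langle a,ll\rangle} 1$, $(1 + 1)\cdot(a + a) \xrightarrow{\langle a,lr\rangle} 1$,
$(1 + 1)\cdot(a + a) \xrightarrow{\langle a,rl\rangle} 1$ and $(1 + 1)\cdot(a + a) \xrightarrow{\langle a,rr\rangle} 1$.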
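The rules of Table 3 and the multiset abstraction above can be prototyped in a few lines. The following is a minimal Python sketch, not the authors' implementation. The tuple encoding of terms and the names `active`, `transitions` and `derivatives` are assumptions made for illustration. Two further assumptions are made: choice sequences are concatenated in rule Seq2, and Seq1 is applied to visible actions only, so that 1-labelled transitions always lead to the process 1.

```python
from collections import Counter

# Hypothetical encoding of PL terms (chosen for this sketch):
#   "0", "1", or an action name such as "a";
#   ("+", P, Q) for choice and (".", P, Q) for sequential composition.

def active(p):
    """Active predicate of Table 2: 0 is the only inactive atom."""
    if isinstance(p, str):
        return p != "0"
    op, left, right = p
    if op == "+":
        return active(left) or active(right)
    return active(left) and active(right)

def transitions(p):
    """Yield (mu, u, target) for every transition P --<mu,u>--> target."""
    if isinstance(p, str):
        if p == "1":
            yield ("1", "", "1")              # Tic
        elif p != "0":
            yield (p, "", "1")                # Atom
        return
    op, left, right = p
    if op == "+":
        for mu, u, t in transitions(left):    # Sum1 / Sum1'
            yield (mu, "l" + u if active(right) else u, t)
        for mu, u, t in transitions(right):   # Sum2 / Sum2'
            yield (mu, "r" + u if active(left) else u, t)
    else:
        for mu, u, t in transitions(left):
            if mu == "1":                     # Seq2: continue with Q
                for nu, v, t2 in transitions(right):
                    yield (nu, u + v, t2)
            elif active(right):               # Seq1 (assumed: visible actions only)
                yield (mu, u, (".", t, right))

def derivatives(p, mu):
    """Multiset M with P --mu--> M: one instance per distinct choice sequence."""
    distinct = {(u, t) for m, u, t in transitions(p) if m == mu}
    return Counter(t for _, t in distinct)

a_plus_a = ("+", "a", "a")
doubled = (".", ("+", "1", "1"), a_plus_a)
print(derivatives(a_plus_a, "a"))   # two copies of the process 1
print(derivatives(doubled, "a"))    # four copies of the process 1
```

Deduplicating on the pair (choice sequence, target) before counting mirrors the existential quantifier over u in the definition of the multiset.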



We shall now introduce the bisimulation-based relation that identifies only
those systems that have exactly the same behaviour and differ only in their
syntactic structure. This equivalence relation, called resource bisimulation and
introduced in [6], relates only those terms whose unfolding, via the operational
semantics, gives rise to isomorphic labelled trees. The multiset transition relation $\xrightarrow{\mu}$
introduced above is the basis for defining resource bisimulation.

Definition 2. (Resource Bisimulation)

a. A relation $\mathcal{R} \subseteq PL \times PL$ is an r-bisimulation if for each $\langle P, Q \rangle \in \mathcal{R}$, for each
$\mu \in A \cup \{1\}$:

i. $P \xrightarrow{\mu} M$ implies $Q \xrightarrow{\mu} M'$ and $\exists f$ injective: $M \to M'$, such that $\forall P' \in M$,
$\langle P', f(P') \rangle \in \mathcal{R}$;

ii. $Q \xrightarrow{\mu} M'$ implies $P \xrightarrow{\mu} M$ and $\exists g$ injective: $M' \to M$, such that
$\forall Q' \in M'$, $\langle Q', g(Q') \rangle \in \mathcal{R}$;






b. $P$ and $Q$ are r-bisimilar ($P \sim_r Q$), if there exists an r-bisimulation $\mathcal{R}$
containing $\langle P, Q \rangle$.

The above definitions should be self-explanatory. We just remark that the
injection $f : M \to M'$ is used to ensure that different (indexed) processes in $M$ are
simulated by different (indexed) processes in $M'$. Thus r-bisimilarity requires
the cardinality of $M$ to be less than or equal to the cardinality of $M'$.

Since the multisets we are dealing with are finite, conditions i) and ii) of
Definition 2 can be summarized as follows: $P \xrightarrow{\mu} M$ implies $Q \xrightarrow{\mu} M'$ and
there exists a bijective $f : M \to M'$, s.t. for all $P' \in M$, $\langle P', f(P') \rangle \in \mathcal{R}$.

With standard techniques it is possible to show that $\sim_r$ is an equivalence
relation and that it is preserved by nondeterministic composition and sequential
composition. It is not difficult to check that $a \not\sim_r a + a$, $a + b \sim_r b + a$ and
$(1 + 1)\cdot a \sim_r a + a$.
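The three claims above can be checked mechanically on a small, explicitly given transition system. The sketch below is an illustration under stated assumptions rather than the authors' procedure: it uses counting partition refinement, a standard technique that yields the coarsest equivalence in which related states have, for each label, the same number of transitions into every class. That is exactly what the bijection of Definition 2 demands on finite multisets. The dictionary `lts` is hand-encoded from the operational semantics with choice sequences dropped, and the function name `resource_bisim_classes` is hypothetical.

```python
def resource_bisim_classes(lts):
    """Map each state to the id of its resource-bisimilarity class.

    `lts` maps a state to a list of (label, target) pairs; repeated pairs
    stand for distinct transitions, so multiplicities are preserved.
    """
    block = {s: 0 for s in lts}                      # start with one class
    while True:
        # Signature: own class plus the sorted multiset of (label, target class).
        sig = {s: (block[s], tuple(sorted((mu, block[t]) for mu, t in lts[s])))
               for s in lts}
        ids = {}
        refined = {s: ids.setdefault(sig[s], len(ids)) for s in lts}
        if len(ids) == len(set(block.values())):     # no class was split
            return refined
        block = refined

# Hand-encoded transitions of a few PL terms (choice sequences dropped).
lts = {
    "1":       [("1", "1")],
    "a":       [("a", "1")],
    "a+a":     [("a", "1"), ("a", "1")],
    "(1+1).a": [("a", "1"), ("a", "1")],
    "a+b":     [("a", "1"), ("b", "1")],
    "b+a":     [("b", "1"), ("a", "1")],
}

cls = resource_bisim_classes(lts)
print(cls["a"] == cls["a+a"])        # False: a and a + a are told apart
print(cls["a+b"] == cls["b+a"])      # True
print(cls["(1+1).a"] == cls["a+a"])  # True
```

Because the current class is part of each signature, every round can only split classes, so the loop stops as soon as the number of classes no longer grows.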

3 A Logical Characterization of Resource Bisimulation 

In this section, we provide a positive logic for resource bisimulation. In [15], 
a modal logic, now known as Hennessy-Milner Logic (HML), is defined which, 
when interpreted over labelled transition systems with (or without) silent ac- 
tions, is proved to be in full agreement with weak (or strong) observational 
equivalence. 

Our logic is obtained from HML by eliminating the false predicate and by 
replacing both the box and the diamond modalities (or, alternatively, both the 
box and the ^ modality) with a family of so called graded modalities [14] of the 
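form $\langle\mu\rangle_n$, where $0 < n < \infty$. Intuitively, a process $P$ satisfies the formula
$\langle\mu\rangle_n \varphi$ if $P$ has exactly $n$ $\mu$-derivatives satisfying formula $\varphi$.

Let Graded HML (GHML) be the set of formulae generated by the following
grammar:

$\varphi ::= tt \mid \varphi \wedge \varphi \mid \langle\mu\rangle_n \varphi$ where $\mu \in A \cup \{1\}$ and $0 < n < \infty$.

The satisfaction relation $\models$ for the logic over GHML formulae is given by:

$P \models tt$ for any $P$

$P \models \varphi_1 \wedge \varphi_2$ iff $P \models \varphi_1$ and $P \models \varphi_2$

$P \models \langle\mu\rangle_n \varphi$ iff $\#(\{|\, P' \mid \exists u.\ P \xrightarrow{\langle\mu,u\rangle} P' \,|\} \cap \{|\, P' \mid P' \models \varphi \,|\}) = n$

We shall let $\mathcal{R}_L$ denote the binary relation over PL processes that satisfy the
same set of GHML formulae:

$\mathcal{R}_L = \{ (P, Q) \mid \forall \varphi \in \mathrm{GHML},\ P \models \varphi \iff Q \models \varphi \}$

and will show that $\mathcal{R}_L$ is a resource bisimulation.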
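The satisfaction relation translates directly into a recursive checker. The following Python sketch is illustrative only: the tuple encoding of formulae, the function name `sat`, and the hand-encoded transition table `lts` (choice sequences dropped, one entry per distinct transition) are all assumptions of this example.

```python
# Hypothetical encoding of GHML formulae:
#   ("tt",)                 the constant tt
#   ("and", phi, psi)       conjunction
#   ("dia", mu, n, phi)     graded modality <mu>_n phi

def sat(lts, state, phi):
    """Return True iff `state` satisfies the GHML formula `phi`."""
    kind = phi[0]
    if kind == "tt":
        return True
    if kind == "and":
        return sat(lts, state, phi[1]) and sat(lts, state, phi[2])
    if kind == "dia":
        _, mu, n, body = phi
        # Count the mu-derivatives (with multiplicity) that satisfy the body.
        hits = sum(1 for label, target in lts[state]
                   if label == mu and sat(lts, target, body))
        return hits == n
    raise ValueError(f"unknown formula: {phi!r}")

# Hand-encoded transitions of a few PL terms.
lts = {
    "1":   [("1", "1")],
    "a":   [("a", "1")],
    "a+a": [("a", "1"), ("a", "1")],
    "a.b": [("a", "1.b")],
    "1.b": [("b", "1")],
}

TT = ("tt",)
two_a = ("dia", "a", 2, TT)                          # <a>_2 tt
a_then_b = ("dia", "a", 1, ("dia", "b", 1, TT))      # <a>_1 <b>_1 tt

print(sat(lts, "a+a", two_a))     # True
print(sat(lts, "a", two_a))       # False: a has exactly one a-derivative
print(sat(lts, "a.b", a_then_b))  # True
```

The formula `two_a` separates a + a from a, in line with the fact that the two terms are not resource bisimilar.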

^ Since a multiset can be seen as a set of indexed elements, an injection between 
multisets can be seen as an ordinary injection between sets. 






Indeed, we can prove that the equivalence induced by GHML formulae
coincides with resource equivalence. The proof that if $P \sim_r Q$ then, for all $\varphi \in$
GHML, it holds that ($P \models \varphi$ iff $Q \models \varphi$) is standard and follows by induction on
the syntactic structure of formulae.

Proposition 1. Let $P$, $Q$ be PL processes. If $P \sim_r Q$ then ($\forall \varphi \in \mathrm{GHML}$,
$P \models \varphi \iff Q \models \varphi$).

The proof of the reverse implication, namely, the proof that any two processes
satisfying the same set of GHML formulae are resource equivalent, requires
a more sophisticated proof technique. It needs to be shown that, if $\mathcal{R}_L$ were not
a resource bisimulation, then there would exist $(P, Q) \in \mathcal{R}_L$ such that for
some $\mu \in Act$, $P \xrightarrow{\mu} M$ implies $Q \xrightarrow{\mu} M'$ and, for all bijective $f_i : M \to M'$,
there would exist $P_i \in M$ such that $(P_i, f_i(P_i)) \notin \mathcal{R}_L$. This implies that there
exists a formula $\varphi \in \mathrm{GHML}$ such that $P_i \models \varphi$ but $f_i(P_i) \not\models \varphi$.

We can prove that, given a multiset of processes, we can find a formula
characterizing each of its bisimulation classes, in the sense that every element
in a class satisfies the characteristic formula of the class and does not satisfy
any formula characterizing any other class. In this way, from the hypothesis that
$P \sim_r Q$ does not hold, we can obtain a formula satisfied by one of the original
processes, but not by the other one.

Given a multiset $M$ and GHML formulae $\varphi_1, \varphi_2, \ldots, \varphi_n$, let $M^{\varphi_i}$ be the
subset of $M$ that satisfies $\varphi_i$:

$M^{\varphi_i} = \{|\, P' \in M \mid P' \models \varphi_i \,|\}$,  $i \in [1..n]$.

If $\{M^{\varphi_1}, M^{\varphi_2}, \ldots, M^{\varphi_n}\}$ is a partition of $M$, then we shall write $M = M^{\varphi_1} \uplus M^{\varphi_2} \uplus \ldots \uplus M^{\varphi_n}$.

Lemma 1. Given a finite multiset $M$ and a partition $M_1 \uplus M_2 \uplus \ldots \uplus M_n$ of $M$
satisfying the property that two elements of the same class satisfy the same
formulae and two elements in two different classes behave differently for at least
one formula, then, for every class, there is a formula (the characteristic formula)
satisfied by all the elements of that class and by none of any other class. Therefore
we can write $M = M^{\xi_1} \uplus M^{\xi_2} \uplus \ldots \uplus M^{\xi_n}$, where $\xi_1, \ldots, \xi_n$ are such that each
$P \in M^{\xi_i}$ satisfies $\xi_i$ ($P \models \xi_i$), while each $Q \in M^{\xi_j}$, $j \neq i$, does not satisfy $\xi_i$
($Q \not\models \xi_i$).

The coincidence between resource bisimulation and the equivalence induced
by the GHML formulae immediately follows from the lemma above.

Proposition 2. Let $P$, $Q$ be PL processes.

If ($\forall \varphi \in \mathrm{GHML}$, $P \models \varphi \iff Q \models \varphi$) then $P \sim_r Q$.







4 Weak Resource Bisimulation 

This section is devoted to nondeterministic expressions in the presence of invisible actions.

Let $A \cup \{1\}$ be a set of visible actions and $\tau \notin A \cup \{1\}$ be the invisible action.
We use $\mu, \gamma, \ldots, \mu', \gamma', \ldots$ to range over $A \cup \{1\} \cup \{\tau\}$, $\alpha, \beta, \ldots$ to range
over $A \cup \{\tau\}$ and $a, b, \ldots, a', b', \ldots$ to range over $A$.

The set of nondeterministic regular expressions over $A \cup \{\tau\}$ is the set of
terms generated by the following grammar:

$P ::= 0 \mid 1 \mid a \mid \tau \mid P + P \mid P \cdot P$ where $a$ is in $A$.

We will refer to the set of terms above as PL as well. We extend the interpretation
given for the $\tau$-less case as follows: $\tau$ denotes a process which can internally
evolve and then successfully terminates. For those familiar with the operational
semantics of process algebras, we would like to remark that 1-actions do not play
the same role as invisible $\tau$-actions. They simply stand for successfully terminated
processes.

To deal with the new actions, we extend the transition relation of Table 3 by
adding the rule:

Tau)  $\tau \xrightarrow{\langle\tau,\varepsilon\rangle} 1$

It relies on the predicate active defined in Table 2, extended with the condition
below, which is used to detect empty processes:

$\mathit{active}(\tau)$.

We have now three kinds of transitions:

• $P \xrightarrow{\langle a,u\rangle} P'$: $P$ performs an action $a$, possibly preceded by 1-actions, with
choice sequence $u$.

• $P \xrightarrow{\langle 1,u\rangle} 1$: $P$ performs 1-actions to reach process 1 with choice sequence $u$.

• $P \xrightarrow{\langle\tau,u\rangle} P'$: $P$ performs an action $\tau$, possibly preceded by 1-actions, with
choice sequence $u$.

These transitions are atomic; they cannot be interrupted and, moreover, leave
no track of intermediate states. In all cases, $u$ is used to keep information about
the possible nondeterministic structure of $P$, and will permit distinguishing those
transitions of $P$ with identical action label and target state.

Starting from elementary transitions, weak transitions can be defined. They
can be invisible or visible. Weak invisible transitions denote sequences of
$\tau$-transitions (possibly interleaved by 1's) that lead to branching nodes, while weak
visible transitions denote the execution of visible actions (possibly) followed or
preceded by invisible moves. As usual, we have also terminating moves, i.e.,
sequences of 1-actions leading to successful termination of a process; see Table 4
for their formal definitions.

Table 4. Weak Transitions for PL.

In order to be able to give full account of the different
alternatives a process has when determining the specific action to perform,
we introduce a transition relation that associates a multiset $M$ to every pair
$P \in PL$, $\mu \in Act \cup \{1\} \cup \{\tau\}$. $M$ represents all processes that are reachable via
(initial) weak $\langle\mu,u\rangle$-transitions from $P$. Since we are interested in the
branching structure of processes and in detecting their actual choice points, we remove
from $M$ all those processes which can perform a $\tau$ action in a purely
deterministic fashion. That is, we remove those target processes which can perform an
initial $\tau$-transition "without choice". This new transition relation is defined as
the least relation such that:

$P \stackrel{\mu}{\Longrightarrow} \{|\, P' \mid \exists u.\ P \stackrel{\langle\mu,u\rangle}{\Longrightarrow} P' \,|\} - \{|\, P' \mid P' \xrightarrow{\langle\tau,v\rangle} P'' \text{ with } v = \varepsilon \,|\}$.

The transition relation $\stackrel{\mu}{\Longrightarrow}$ is the basis for defining weak resource equivalence.



Definition 3. (Weak Resource Bisimulation)

1. A relation $\mathcal{R} \subseteq PL \times PL$ is a weak resource bisimulation if for each $\langle P, Q \rangle \in \mathcal{R}$,
for each $\mu \in A \cup \{1\} \cup \{\tau\}$:

(i) $P \stackrel{\mu}{\Longrightarrow} M$ implies $Q \stackrel{\mu}{\Longrightarrow} M'$ and there exists an injective $f : M \to M'$,
such that for all $P' \in M$, $\langle P', f(P') \rangle \in \mathcal{R}$;

(ii) $Q \stackrel{\mu}{\Longrightarrow} M'$ implies $P \stackrel{\mu}{\Longrightarrow} M$ and there exists an injective $g : M' \to M$,
such that for all $Q' \in M'$, $\langle g(Q'), Q' \rangle \in \mathcal{R}$.

2. $P$ and $Q$ are weak resource equivalent ($P \approx_r Q$), if there exists a weak
r-bisimulation $\mathcal{R}$ containing $\langle P, Q \rangle$.

Remark 1. An immediate difference between weak resource equivalence and the
standard observational equivalence, see e.g. [18], is that we do not consider
"empty" $\tau$-moves, i.e., moves of the form $P \stackrel{\varepsilon}{\Longrightarrow} P$; we require at least one $\tau$
to be performed. In our framework, a process cannot idle to match a transition
of another process which performs invisible actions.






Below, we provide a number of examples that should give an idea of the
discriminating power of weak resource equivalence:

- Processes $\tau\cdot\tau$ and $\tau$ are related, while $\tau + \tau$ and $\tau$ are taken apart. The
reason for the latter differentiation is similar to that behind $1 + 1 \not\sim_r 1$.
Indeed $(\tau + \tau)\cdot a$ is equal to $\tau\cdot a + \tau\cdot a$, which has to be different (in this
counting setting) from $\tau\cdot a$.

- Processes $a\cdot\tau$ and $a$ are related because the $\tau$ following action $a$ is not
relevant from the branching point of view.

- $\tau\cdot a$ and $a$ are instead distinguished because the $\tau$ action preceding the $a$
in the former process can influence choices when embedded in a
non-deterministic context.

- Processes $\tau$ and 1 are not equivalent, again because a $\tau$ action can be ignored
only after a visible move.

- Processes $(\tau + 1)\cdot a$ and $a + \tau\cdot a$ are weak resource bisimilar.

- Processes $\tau\cdot(\tau + 0)$ and $\tau\cdot\tau$ are weak resource bisimilar.

The following proposition states a congruence result for weak resource
bisimulation. We can prove that our equivalence is actually preserved by all PL
operators; notably, it is preserved by +. This is another interesting property of
our equivalence notion; weak equivalences are usually not preserved by + and
additional work is needed to isolate the coarsest congruence contained in them.

Proposition 3. Weak resource equivalence is preserved by all PL operators.

Let us now consider the simulation relation, denoted by $\lesssim_r$ and called weak
resource simulation, obtained by removing item 1.(ii) from Definition 3. It can be
shown that (as for strong resource bisimulation) the kernel of $\lesssim_r$ coincides
with weak resource equivalence.

Proposition 4. Let $\lesssim_r$ be the preorder obtained by considering one of the two
items in the definition of $\approx_r$ and let $P$ and $Q$ be two processes. Then, $P \approx_r Q$
iff $P \lesssim_r Q$ and $Q \lesssim_r P$.

The logical characterization of resource equivalence can be easily extended
to the weak case. It is sufficient to extend the alphabet and to introduce a $\tau$
modality. Then, within the actual definition of the satisfaction relation:

- $P \xrightarrow{\langle\mu,u\rangle} P'$ has to be replaced by $P \stackrel{\langle\mu,u\rangle}{\Longrightarrow} P'$;

- $\#(\{|\, P' \mid \exists u.\ P \xrightarrow{\langle\mu,u\rangle} P' \,|\} \cap \{|\, P' \mid P' \models \varphi \,|\}) = n$ has to be replaced by its counterpart over the multiset of weak derivatives;

- $P \sim_r Q$ has to be replaced by $P \approx_r Q$.

Proposition 5. The equivalence induced by extended weak GHML formulae
coincides with weak resource equivalence.







Table 5. The $\tau$-law for EPL.

$\alpha \cdot \tau \cdot X = \alpha \cdot X$   (T1)



A sound and complete axiomatization of weak resource equivalence over PL
processes can also be provided. We can prove that the new weak equivalence
is fully characterized by the axiom of Table 5 (recall that now $\alpha \in A \cup \{\tau\}$)
together with the set of axioms of Table 1 (which soundly and completely
axiomatize strong resource equivalence [6]).

Proposition 6. The axiom of Table 5 together with the set of axioms of Table 1
soundly and completely axiomatize weak resource bisimulation over PL.

Remark 2. Consider the axiom of Table 5 and replace action $\alpha$ with $\mu$. The
resulting axiom, $\mu \cdot \tau \cdot X = \mu \cdot X$, is not sound. Indeed, by letting $\mu = 1$ and
$X = 1$ we would have that $1 \cdot \tau \cdot 1$ and $1 \cdot 1$ are related by the equational theory,
while they are not weak resource equivalent as remarked above. Therefore the
axiom $X \cdot \tau \cdot Y = X \cdot Y$ is not sound.
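To see the axiom of Table 5 at work, the following worked derivation (a sketch added for illustration) recovers the first two examples given earlier in this section, namely that τ·τ is related to τ and that a·τ is related to a. It uses only the axiom labelled (T1) in Table 5 and the unit law for sequential composition labelled (S2) in Table 1; associativity (S1) is used implicitly to drop parentheses.

```latex
\begin{align*}
\tau\cdot\tau &= \tau\cdot\tau\cdot 1 && \text{(S2)}\\
              &= \tau\cdot 1          && \text{(T1) with } \alpha = \tau,\ X = 1\\
              &= \tau                 && \text{(S2)}\\[4pt]
a\cdot\tau    &= a\cdot\tau\cdot 1    && \text{(S2)}\\
              &= a\cdot 1             && \text{(T1) with } \alpha = a,\ X = 1\\
              &= a                    && \text{(S2)}
\end{align*}
```

In both derivations the leading factor ranges over A ∪ {τ}; no such derivation is available when the leading factor is 1, which is the point of Remark 2.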

5 Conclusions 

We have introduced graded modalities and used them to provide a logical
characterization of the strong and weak versions of resource bisimulation, an
equivalence which discriminates processes according to the number of different
computations they can perform to reach specific states. As a result, resource bisimulation
identifies all and only those labelled transition systems that give rise to
isomorphic unfoldings. In the case of the weak variant this isomorphism is guaranteed
up to ignoring the invisible $\tau$-actions. We have also extended the complete
axiomatization of strong resource bisimulation of [6] to the weak variant of the
equivalence.

The results that we have obtained for regular expressions can easily be
extended to full-fledged process algebras like CCS, CSP, ACP or variants thereof
that are equipped with a structural operational semantics, if care is taken to
properly model the choice operators.

The logic we have introduced to characterize both resource and weak resource
bisimulation can easily be related to other modal logics introduced for dealing
with bisimulation. In particular, referring to [9] as a comprehensive treatment,
we can describe our logic as a polymodal $\aleph_0$ graduated logic. Also in [9],
bisimulation is a $k$-counting bisimulation, which in the case $k = \aleph_0$ coincides with our
resource bisimulation. On the other hand the two points of view are completely
different and, in some sense, complementary. There, one was interested in the
largest logic (the more expressive one) invariant under bisimulation. Here, we
are looking for the minimal logic that is sufficient for characterizing bisimilar






processes. As a consequence, our logic is extremely poor in connectives (just
conjunction) as well as in atomic propositions (just tt). In this way we showed
that, for example, negation is not necessary. We have also extended our result to
the case in which a "silent" relation $\tau$ between worlds is allowed, while we have
not explicitly treated infinite terms. Nonetheless it can be immediately seen, by
looking at the structure of the proofs, that, if we allow infinite terms corresponding
to behaviours with finite branching, e.g. guarded by $\mu$ operators, all the results
will still hold. This is in accordance with the fact that, in the modal logics quoted
above, $\mu$ operators are introduced for positive formulae only, and our language
consists of strictly positive formulae.

References

1. Baeten, J.C.M., Bergstra, J.A.: Process Algebra with a Zero Object. In Proc. Concur'90, LNCS 458, pp. 83-98, 1990.

2. Browne, M.C., Clarke, E., Grumberg, O.: Characterizing Finite Kripke Structures in Propositional Temporal Logic. Theoretical Computer Science 59(1,2), pp. 115-131, 1988.

3. Baeten, J., Weijland, P.: Process Algebras. Cambridge University Press, 1990.

4. Corradini, F., De Nicola, R., Labella, A.: Fully Abstract Models for Nondeterministic Regular Expressions. In Proc. Concur'95, LNCS 962, Springer Verlag, pp. 130-144, 1995.

5. Corradini, F., De Nicola, R., Labella, A.: A Finite Axiomatization of Nondeterministic Regular Expressions. Theoretical Informatics and Applications. To appear. Available from: ftp://rap.dsi.unifi.it/pub/papers/FinAxNDRE. Abstract in EICS, Brno, 1998.

6. Corradini, F., De Nicola, R., Labella, A.: Models for Nondeterministic Regular Expressions. Journal of Computer and System Sciences. To appear. Available from: ftp://rap.dsi.unifi.it/pub/papers/NDRE.

7. Clarke, E.M., Emerson, E.A., Sistla, A.P.: Automatic Verification of Finite State Concurrent Systems using Temporal Logic Specifications. ACM TOPLAS 8(2), pp. 244-263, 1986.

8. Cleaveland, R., Parrow, J., Steffen, B.: The Concurrency Workbench. ACM TOPLAS 15(1), pp. 36-72, 1993.

9. D'Agostino, G.: Modal Logics and non well-founded Set Theories: translation, bisimulation and interpolation. Thesis, Amsterdam, 1998.

10. De Nicola, R.: Extensional Equivalences for Transition Systems. Acta Informatica 24, pp. 211-237, 1987.

11. De Nicola, R., Labella, A.: Tree Morphisms and Bisimulations. Electronic Notes in TCS 18, 1998.

12. De Nicola, R., Vaandrager, F.: Three Logics for Branching Bisimulation. Journal of ACM 42(2), pp. 458-487, 1995.

13. Emerson, E.A., Halpern, J.Y.: "Sometimes" and "not never" revisited: On branching versus linear time temporal logic. Journal of ACM 33(1), pp. 151-178, 1986.

14. Fattorosi-Barnaba, M., De Caro, F.: Graded Modalities, I. Studia Logica 44, pp. 197-221, 1985.

15. Hennessy, M., Milner, R.: Algebraic Laws for Nondeterminism and Concurrency. Journal of ACM 32, pp. 137-161, 1985.

16. Hoare, C.A.R.: Communicating Sequential Processes. Prentice Hall, 1985.

17. van Glabbeek, R.J.: Comparative Concurrency Semantics and Refinement of Actions. Ph.D. Thesis, Free University, Amsterdam, 1990.

18. Milner, R.: Communication and Concurrency. Prentice Hall, 1989.

19. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems. Springer Verlag, 1992.

20. Park, D.: Concurrency and Automata on Infinite Sequences. In Proc. GI, LNCS 104, pp. 167-183, 1981.

21. Salomaa, A.: Two Complete Axiom Systems for the Algebra of Regular Events. Journal of ACM 13, pp. 158-169, 1966.



The Non-recursive Power of Erroneous
Computation*

Christian Schindelhauer¹,³ and Andreas Jakoby²,³

¹ ICSI Berkeley, 1947 Center Street, Berkeley, USA
schindel@icsi.berkeley.edu
² Depart. of Computer Science, Univ. of Toronto, Canada
jakoby@cs.toronto.edu
³ Med. Univ. zu Lübeck, Inst. für Theoretische Informatik,
Lübeck, Germany

Abstract. We present two new complexity classes which are based on a
complexity class C and an error probability function F. The first, F-Err C,
reflects the (weak) feasibility of problems that can be computed within
the error bound F. As a more adequate measure to investigate lower
bounds we introduce F-Err_io C, where the error is infinitely often bounded
by the function F. These definitions generalize existing models of feasible
erroneous computations and cryptographic intractability.

We identify meaningful bounds for the error function and derive new
diagonalizing techniques. These techniques are applied to known time
hierarchies to investigate the influence of the error bound. It turns out that in
the limit a machine with slower running time cannot predict the diagonal
language within a significantly smaller error probability than 1/2.

Further, we investigate two classical non-recursive problems: the
halting problem and the Kolmogorov complexity problem. We present strict
lower bounds proving that any heuristic algorithm claiming to solve one
of these problems makes unrecoverable errors with constant probability.
Up to now it was only known that infinitely many errors will occur.



1 Introduction 

Answering the question whether NP equals P is a main goal of computational
complexity theory. If they differ, which is the widely assumed case, a polynomial
time bounded deterministic algorithm cannot correctly decide an NP-complete
problem. Hence, the correctness of such an algorithm is the most reasonable
requirement. Nevertheless, in the desperate situation where one wants to solve
an infeasible problem one may accept errors, provided their influence can be
controlled somehow. The quality of such an error can be explored in more detail.

Probabilistic error: It is commonly accepted that BPP can be seen as a class of
efficiently solvable problems. Unlike a P-algorithm, a BPP-algorithm can make
errors, but on every input this error probability has to be bounded by a constant below 1/2 (conventionally 1/3). So,

* parts of this work are supported by a stipend of the “Gemeinsames Hochschulson- 
derprogramm III von Bund und Lander” through the DAAD. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 394—406, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






BPP-algorithms can be modified to decrease this error probability to an
arbitrarily small non-zero polynomial. Furthermore, BPP-algorithms solve problems
for which no P-algorithm is known yet, e.g. primality testing.
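The text does not say how such a modification is carried out. One standard technique is to repeat the algorithm independently and output the majority answer. The Python sketch below computes the exact error probability of that majority vote; the function name `majority_error` and the per-run error of 1/3 used in the example are assumptions chosen for illustration.

```python
from math import comb

def majority_error(eps, k):
    """Probability that the majority of k independent runs is wrong.

    Each run errs independently with probability eps; k is assumed odd,
    so the vote is wrong exactly when more than k // 2 runs are wrong.
    """
    return sum(comb(k, i) * eps**i * (1 - eps)**(k - i)
               for i in range(k // 2 + 1, k + 1))

# Error of the majority vote for a per-run error of 1/3.
for k in (1, 3, 11, 51):
    print(k, majority_error(1 / 3, k))
```

For any fixed per-run error below 1/2 the printed values shrink as k grows, which is the effect the paragraph above relies on.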

Expected time: A valid enhancement of the notion of efficiency is to
measure the expectation over the running times of different inputs according to a
probability distribution over the input space. E.g. it turns out that the
Davis-Putnam-algorithm [DaPu 60] solves SAT for randomly chosen Boolean
formulas in polynomial expected time, if the probability distribution fulfills certain
requirements [PuBr 85,SLM 92]. Note that, since this algorithm is
deterministic, the probability refers only to the way of choosing the input.
Probabilistic algorithms (using random bits) have been investigated in this respect,
too [Hoos 98]. An algorithm which is efficient with respect to its expected time
behavior has to compute a function always correctly, but its computation time
can exceed the expected time bound enormously for some inputs.

Average complexity classes: It turns out that complexity classes defined by
expected time bounds are not closed under polynomial time bounded simulations.
For this reason Levin defined polynomial on the average [Levi 86], a superset of
expected polynomial time that initiated the average-case branch of
computational complexity. Using Levin's measure for the average complexity there exists
a reasonable notion of NP-average-case-completeness.

Traditionally, one investigates the worst case of resources over an input length
needed to solve a problem. In average case complexity theory these resources are
weighted by a probability distribution over the input space. As a consequence,
in worst case theory it is only necessary to consider functions $f : \mathbb{N} \to \mathbb{N}$ for an
entire description of the considered complexity bound. In average case
complexity theory there are a variety of ways to average over the resource function. Here
it is necessary to examine pairs of resource functions $f : \Sigma^* \to \mathbb{N}$ (e.g. time)
and probability distributions over the set of inputs. A variety of different
concepts are investigated to define average complexity measures and corresponding
classes [Levi 86, Cure 91, BCG 92, ScWa 94, CaSe 99, ReSc 96]. All these concepts
have in common that the running times of all possible input values with
positive weights account for the average behavior. An important result in average
complexity is that if average-P (AvP) covers NP, then NE = E [BCG 92].
Furthermore, the fraction of non-polynomial computations of an algorithm solving
an NP-problem can be bounded under this premise.

Benign faults: Here the algorithm outputs a special symbol "?" on inputs
where it fails, yet produces correct outputs in polynomial time for all other
inputs. An algorithm for an NP-problem producing only a small quantity of
so-called benign faults can be transformed into an AvP-algorithm [Imp2 95]. This
observation is intuitively clear: if an algorithm "knows" that it cannot
solve a certain instance of an NP-problem, it can use the trivial exponentially
time-bounded simulation of the nondeterministic Turing machine. If the
probability for these instances is exponentially small, the resulting average-case time
stays polynomial.






Similar questions were considered in [Schi 96], e.g. Schindelhauer introduces 
the class MedDistTime(T,F) which is strongly related to the statistical p-th 
quantile. Here, a machine M may violate the time bound T’(lxj) for at most 
F{t} of the ^ most likely inputs x. But the machine has to decide on the lan- 
guage correctly. This setting of correctness is equivalent to benign fault when we 
consider time-bounded complexity classes. 

Real faults: In [ImWi 98, Schi 96,ScJa 97,Yaml 96,Yam2 96] a different ap- 
proach is introduced. They investigate the number of inputs for a given machine 
causing an erroneous computation or breaking the time limit respectively. Note 
that for a fix time bound exceeding the time limit causes an error. 

Yamakami proposed the notion of Nearly-P and Nearly-BPP (see [Yam1 96]
and [Yam2 96]). Here the error probability has to be smaller than the inverse of
any polynomial in the input length. More precisely, even a polynomial number of
instances (chosen according to the probability distribution) induces a
super-polynomially small error bound again. Thus, Nearly-P and Nearly-BPP define
reasonably efficient complexity classes if the corresponding algorithm is only
used for a polynomial number of inputs for each length.

Independently, Schindelhauer et al. in [Schi 96, ScJa 97] introduced a similar
approach. Based on a T-time decidable language $L$ and an error probability
function $F$ they investigate pairs of languages $L'$ and probability
distributions $\mu$ where for all $\ell \in \mathbb{N}$ the number of the $\ell$
most likely inputs $x \in \Sigma^*$ (according to $\mu$) with $L'(x) \ne L(x)$ is
bounded by $F(\ell)$.

Impagliazzo and Wigderson in [ImWi 98] investigate the relationship between
BPP and an error complexity class called $\mathrm{HeurTime}_{\varepsilon(n)}(T(n))$,
where for each input length the number of unrecoverable errors of a T-time
bounded algorithm is bounded by $\varepsilon$. Their main motivation for this
definition is a better understanding of the relationship of BPP, P/poly and E.

Erroneous Computation has a practical impact in designing more efficient 
algorithms, see for example [Fagi 92,GeHo 94,Reif 83]. In these papers parallel 
algorithms, resp. circuits, for adding two large binary numbers are presented 
which are efficient on the average. The basic part of these strategies is a fast but 
erroneous algorithm with a polynomial share of inputs causing wrong outputs. 
This results in a double logarithmic time bound. An additional error-detection- 
strategy restores correctness. 
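The combination of a fast erroneous algorithm with an error-detection strategy can be illustrated by the following Python sketch. It is not the construction of [Fagi 92, GeHo 94, Reif 83]; it is a sequential toy model under the assumption that each carry is guessed from a window of only `w` lower bit positions, and the names `speculative_add`, `needs_correction` and `add` are hypothetical.

```python
def speculative_add(a, b, n, w):
    """Add two n-bit numbers, guessing each carry from the w bits below it (w >= 1).

    The guess assumes that no carry enters the window, so the result can be wrong.
    """
    result = 0
    for i in range(n + 1):
        lo = max(0, i - w)
        mask = (1 << (i - lo)) - 1
        # Carry out of the window [lo, i), computed with carry-in 0.
        carry = (((a >> lo) & mask) + ((b >> lo) & mask)) >> (i - lo)
        bit = ((a >> i) ^ (b >> i) ^ carry) & 1
        result |= bit << i
    return result

def needs_correction(a, b, w):
    """Error detection: a carry guess can only fail if at least w consecutive
    positions propagate a carry, i.e. a XOR b has a run of w ones."""
    p, run = a ^ b, 0
    while p:
        run = run + 1 if p & 1 else 0
        if run >= w:
            return True
        p >>= 1
    return False

def add(a, b, n, w):
    """Always correct: use the exact (slow) sum only when the detector fires."""
    return a + b if needs_correction(a, b, w) else speculative_add(a, b, n, w)
```

The detector is conservative: it may trigger the slow path although the speculative result happens to be correct, but it never lets a wrong result pass.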

Based on this work done so far, it is reasonable to extend these definitions 
to arbitrary classes C and to consider the general properties of error complexity 
classes. 

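Definition 1 For a class $\mathcal{C}$ and a bound $F : \mathbb{N} \to [0,1]$ define the
distributional complexity classes of F-error bounded $\mathcal{C}$ and infinitely
often F-error bounded $\mathcal{C}$ as sets of pairs of languages $L \subseteq \Sigma^*$
and probability distributions $\mu : \Sigma^* \to [0,1]$ as follows:

$F\text{-Err}\,\mathcal{C} := \{ (L,\mu) \mid \exists S \in \mathcal{C} \ \forall n : \mathrm{Prob}_\mu[x \in (L \,\triangle\, S) \mid x \in \Sigma^n] \le F(n) \}$,
$F\text{-Err}_{io}\,\mathcal{C} := \{ (L,\mu) \mid \exists S \in \mathcal{C} \ \exists_{io} n : \mathrm{Prob}_\mu[x \in (L \,\triangle\, S) \mid x \in \Sigma^n] \le F(n) \}$,
where $A \,\triangle\, B := (A \setminus B) \cup (B \setminus A)$ is the symmetric difference of sets A and B.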
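For small input lengths the error probability used in Definition 1 can be computed by exhaustive enumeration. The Python sketch below assumes the uniform distribution on each length and a binary alphabet; the helper names and the example languages (`parity`, `empty`) are illustrative choices and not part of the definition. A finite check of this kind can refute membership for a concrete approximating language S, but it cannot prove a statement about all n.

```python
from itertools import product

def error_probability(in_L, in_S, n, alphabet="01"):
    """Prob[x in (L symmetric-difference S) | x in Sigma^n], uniform distribution."""
    words = ["".join(t) for t in product(alphabet, repeat=n)]
    wrong = sum(1 for w in words if in_L(w) != in_S(w))
    return wrong / len(words)

def within_bound(in_L, in_S, F, lengths):
    """Check the condition error <= F(n) on the given finite set of lengths only."""
    return all(error_probability(in_L, in_S, n) <= F(n) for n in lengths)

def parity(w):
    """Example language L: words with an even number of ones."""
    return w.count("1") % 2 == 0

def empty(w):
    """Example approximation S: the empty language."""
    return False
```

For instance, `within_bound(parity, empty, lambda n: 0.5, range(1, 6))` checks whether the empty language approximates the parity language with error at most 1/2 on lengths 1 to 5.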






Figure 1 illustrates the error behavior of languages $S_1, S_2, S_3 \in \mathcal{C}$ with
respect to a given language L. $S_1$ provides smaller error probability than F for
all input lengths. Hence, $S_1$ proves that $L \in F$-Err $\mathcal{C}$. The error
probability of $S_2$ will infinitely often fall below the error bound F. If no
language in $\mathcal{C}$ with the behavior of $S_1$ and $S_2$ exists, L cannot be
approximated by any language in $\mathcal{C}$. It follows that
$L \notin F$-Err$_{io}$ $\mathcal{C}$. Figure 2 illustrates the error probability of S
w.r.t. L: $L \in F_1$-Err $\{S\}$ and $L \in F_2$-Err$_{io}$ $\{S\}$, but
$L \notin F_2$-Err $\{S\}$ and $L \notin F_3$-Err$_{io}$ $\{S\}$.




Fig. 1. The error probability of languages $S_1$, $S_2$, and $S_3$ with respect
to L for increasing input length n.

Fig. 2. $F_3$ gives a lower bound of the error probability of a language S with
respect to the io-measure. The error bound $F_1$ gives an upper bound for both
classes.



Using Definition 1 we can generalize the classes Nearly-BPP (introduced by
Yamakami [Yam1 96, Yam2 96]) and Nearly-P by Nearly-$\mathcal{C}$ :=
$n^{-\omega(1)}$-Err $\mathcal{C}$ for an arbitrary class $\mathcal{C}$.
Nearly-$\mathcal{C}$ represents complexity classes with a very low error probability.

Computational classes do not only describe the feasibility of problems, some- 
times they are used to ensure intractability. An important application is cryp- 
tography, where one wants to prove the security of encryption algorithms, inter- 
active protocols, digital signature schemes, etc. The security is often based on 
intractability assumptions for certain problems, e.g. factoring of large numbers 
or the computation of quadratic residuosity. Problems are called intractable if
every polynomial time bounded algorithm outputs errors for a sufficiently high
number of instances.

An adequate measure of lower bounds turns out to be $F$-Err$_{io}$ $\mathcal{C}$,
which generalizes existing models of cryptographic intractability, e.g.

Definition 2 [GMR 88] A function f is GMR-intractable for a probability
distribution $\mu$, if for all probabilistic polynomial time bounded algorithms A it
holds $\forall c > 0 \ \forall_{ae} k : \mathrm{Prob}_\mu[A(x) = f(x) \mid x \in \Sigma^k] < \frac{1}{k^c}$.

Let FBPP be the functional extension of BPP, or more precisely: $f \in$ FBPP
iff there exists a polynomial time bounded probabilistic Turing machine M such
that $\mathrm{Prob}[M(x) \ne f(x)] \le \frac{1}{3}$. Note that the error of
$\frac{1}{3}$ can be decreased to any inverse polynomial without losing the
polynomial time behavior of M. To classify GMR-intractable functions by error
complexity classes we take an appropriate generalization of
$F$-Err$_{io}$ $\mathcal{C}$ for functional classes $\mathcal{FC}$, i.e.
$(f, \mu) \in F$-Err$_{io}$ $\mathcal{FC}$ iff

$\exists g \in \mathcal{FC} \ \exists_{io} n : \mathrm{Prob}_\mu[f(x) \ne g(x) \mid x \in \Sigma^n] \le F(n)$.

Proposition 1 GMR-intractable functions are not in $(1 - n^{-O(1)})$-Err$_{io}$ FBPP.

The intractability assumption of [GoMi 82] and the hard-core sets of [Imp1 95]
refer to non-uniform Boolean circuits. These classes can analogously be expressed 
by using error complexity classes. 

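In the rest of this paper we concentrate our considerations on complexity
classes of languages and the uniform distribution $\mu_{uni}$ as underlying
probability distribution, where for all $n \in \mathbb{N}$ and $x, y \in \Sigma^n$
holds $\mu_{uni}(x) = \mu_{uni}(y) > 0$. For the sake of readability we omit the
distribution:

$L \in F\text{-Err}\,\mathcal{C} \ :\Longleftrightarrow\ (L, \mu_{uni}) \in F\text{-Err}\,\mathcal{C}$,

$L \in F\text{-Err}_{io}\,\mathcal{C} \ :\Longleftrightarrow\ (L, \mu_{uni}) \in F\text{-Err}_{io}\,\mathcal{C}$.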

In this paper we show a first classification of suitable error bounds and extend 
these lower bound results to time hierarchies. In section 4 we discuss in detail 
the error complexity of the halting problem and the Kolmogorov complexity 
problem. Non-recursiveness does not imply high error bounds in general. For 
the halting problem the Gödel-enumeration of programs has to be examined
under new aspects. For some enumerations there are algorithms computing the 
halting problem within small error probability. We show in the following that 
even a standard enumeration yields a high lower error bound. In the case of 
Kolmogorov-complexity the lower bound is even higher and worst possible: We 
give a constant bound independent from the encoding. 



2 Notations 

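For a predicate $P(n)$ let $\forall_{ae} n : P(n)$ be equivalent to
$\exists n_0 \forall n \ge n_0 : P(n)$ and $\exists_{io} n : P(n)$ to
$\forall n_0 \exists n \ge n_0 : P(n)$. Further, define $f(n) \le_{ae} g(n)$ as
$\forall_{ae} n : f(n) \le g(n)$ and $f(n) \le_{io} g(n)$ as
$\exists_{io} n : f(n) \le g(n)$.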

We consider strings over an at least binary alphabet $\Sigma$, where $\lambda$
denotes the empty word. Define $\Sigma^{\le n} := \bigcup_{i \le n} \Sigma^i$.
Further, we use the lexicographical order function ord : $\Sigma^* \to \mathbb{N}$
as straightforward isomorphism and its inverse function str : $\mathbb{N} \to \Sigma^*$.
$\mathcal{RE}$ and $\mathcal{REC}$ denote the sets of all partial recursive, resp.
recursive predicates. For a partial function f the domain is called dom(f).
We use $f(x) = \bot$ to denote $x \notin \mathrm{dom}(f)$. Furthermore, let
$L_0, L_1, L_2, \ldots$ be an enumeration of the languages in $\mathcal{RE}$ over a
given alphabet $\Sigma$. For easier notation we represent a language $L_i$, resp. a
segment $L_i[a,b]$ of it, by its characteristic string
$L_i[a,b] := L_i(a) \cdots L_i(b)$, where $L_i(\mathrm{ord}(w)) = 1$ if $w \in L_i$
and 0 otherwise.
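The functions ord and str and the characteristic string can be made concrete for the binary alphabet. The following Python sketch assumes the length-lexicographic ordering "", "0", "1", "00", ... (one natural reading of the lexicographical order used here); the names `ord_word`, `str_word` and `characteristic_string` are hypothetical.

```python
def ord_word(w):
    """Position of the binary word w in the order "", "0", "1", "00", "01", ..."""
    return int("1" + w, 2) - 1

def str_word(n):
    """Inverse of ord_word: the n-th binary word in the same order."""
    return bin(n + 1)[3:]          # strip the prefix '0b1'

def characteristic_string(in_L, a, b):
    """The segment L[a, b] = L(a) ... L(b) as a string of 0s and 1s."""
    return "".join("1" if in_L(str_word(i)) else "0" for i in range(a, b + 1))
```

For example, `characteristic_string(lambda w: w.endswith("1"), 0, 6)` lists the membership bits of the words "", "0", "1", "00", "01", "10", "11" for the language of words ending in 1.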

For a partial recursive function $\varphi$ and for all $x, y \in \Sigma^*$ define the
relative Kolmogorov complexity as
$C_\varphi(x|y) := \min\{ |p| : \varphi(p,y) = x \}$. A programming system $\varphi$
is called universal if for all partial recursive functions f holds
$\forall x, y \in \Sigma^* : C_\varphi(x|y) \le C_f(x|y) + O(1)$. Fixing such a
universal programming system $\varphi$ we define $C_\varphi(x) := C_\varphi(x|\lambda)$
as the (absolute) Kolmogorov complexity of x.

3 The Bounds of Error Complexity Classes 

When the error probability tends to 1, the error complexity becomes meaningless.
But for the io-error classes the corresponding probability is $\frac{1}{2}$.

Proposition 2 Let $\mathcal{C}, \mathcal{C}'$ be complexity classes, where $\mathcal{C}$ is
closed under finite variation and $\mathcal{C}'$ contains at least two complementary
languages. Then for all functions $z(n) =_{ae} 0$ and $z'(n) =_{io} 0$ it holds that

2^ = -ErrC and 2^ = ^ | ^ -Errio C' . 

This only holds for decision problems — the situation for functional classes is 
very different. Further note that this proposition holds for all classes C covering 
the set of regular languages. Consequently, all languages (even the non-recursive 
languages) can be computed by a finite automaton within these error probabil- 
ities. The first upper error bound of this proposition is sharp: Using a delayed 
diagonalization technique we can construct a language L that cannot be com- 
puted with an arbitrary Turing machine within an error probability 1 — if 
e(n) >jo 1- 

Theorem 1 There exists a language L such that for all functions $e(n) >_{io} 1$ it holds

L (i_ ^)_Err7^£: . 

To prove that the upper bound of the io-error complexity measure is tight in
the limit we will use the following technical lemma dealing with the Hamming
distance $|x - y|$ of two binary strings $x, y \in \{0,1\}^n$.

Lemma 1 Let $m(k,n) := \sum_{i=0}^{k} \binom{n}{i}$. Then it holds:

1. Let $x_1, \ldots, x_k \in \{0,1\}^\ell$ and $d \in \mathbb{N}$ with
$k \cdot m(d,\ell) < 2^\ell$. There exists $y \in \{0,1\}^\ell$ such that
$\min_{i \in \{1,\ldots,k\}} |x_i - y| > d$.

2. For a < ^ and a • n G IN it holds that m{a ■ n,n) < 

3. For $f \in \omega(\sqrt{n})$ there exists $g \in \omega(1)$ : $g(n) \cdot m(n/2 - f(n), n) <_{ae} 2^n$.
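The first item of Lemma 1 follows from a counting argument: each $x_i$ excludes a Hamming ball of $m(d,\ell)$ strings, so fewer than $2^\ell$ strings are excluded in total. The Python sketch below makes this concrete by exhaustive search, which is feasible only for small lengths; the function names are hypothetical. It returns the lexicographically smallest suitable string, in the spirit of the lexicographically minimal choice used later in the construction.

```python
from itertools import product
from math import comb

def ball_volume(k, n):
    """m(k, n): number of n-bit strings within Hamming distance k of a fixed string."""
    return sum(comb(n, i) for i in range(k + 1))

def hamming(x, y):
    """Hamming distance of two strings of equal length."""
    return sum(a != b for a, b in zip(x, y))

def far_string(xs, d):
    """Lexicographically smallest y with Hamming distance > d to every string in xs."""
    length = len(xs[0])
    if len(xs) * ball_volume(d, length) >= 2 ** length:
        raise ValueError("precondition k * m(d, l) < 2^l is violated")
    for candidate in product("01", repeat=length):
        y = "".join(candidate)
        if all(hamming(x, y) > d for x in xs):
            return y
    raise AssertionError("unreachable: the counting argument guarantees a solution")
```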

One may wonder whether high error complexity implies high Kolmogorov 
complexity. At least the contrary is true. 

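Lemma 2

$L \in f\text{-Err}\,\mathcal{REC} \ \Longrightarrow\ \exists c \ \forall n : C(L \cap \Sigma^n) \le \log m(f(n), |\Sigma^n|) + c$
$L \in f\text{-Err}_{io}\,\mathcal{REC} \ \Longrightarrow\ \exists c \ \exists_{io} n : C(L \cap \Sigma^n) \le \log m(f(n), |\Sigma^n|) + c$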







In [GGH 93] a similar result is presented. They relate average time complex- 
ity and time-bounded Kolmogorov complexity. 

The Kolmogorov complexity of enumerable sets is low, since for all enumerable
sets L it holds $\forall n : C(L \cap \Sigma^n) \le n + O(1)$. For an excellent survey
of this field see [LiVi 97]. We show an explicit construction of a diagonal
language with low Kolmogorov complexity (comparable to an enumerable set), giving
a strict lower bound for the io-error complexity. This language L will be
constructed by using the Hamming distance between k sublanguages
$L_0[i, i+\ell], \ldots, L_k[i, i+\ell]$ of length $\ell$ for increasing k and $\ell$.
We will show that L cannot be approximated by any Turing machine within an
io-error of c for any constant $c < 1/2$.

Theorem 2 For any e G w(l) there exists L ^ ~ ) “Errio7?.£ such 

that $\forall n : C(L \cap \Sigma^n) \le n + e(n)$.

Proof: For a function / e let a{m) := 5 - g{m) := 

and y(x) := min{ ^ G IN j g{\S\^) >x}. 

We construct a language L which cannot be approximated by partial recur- 
sive languages Lq,Li,... within an io-error probability of a{\E\'^) as follows: 
Define hi := ]A"^*j, ki := jH*] and choose a sublanguage Si^i lexicographically 
minimal such that for all j < i holds \Lj [bi,bi^i — 1] — Se^i\ > a{k) ■ ki. From 
Lemma 1 we can conclude that for all £ € IN and c > 1 such a language Si s{e)+c 
always exists. Finally, we define the language L as the concatenation of the sub- 
languages Sei,i for i = 0, 1, 2, . . . and ii := [gdH]*)] . From the definition of 
it follows, that for each language Li, there exists an index j, such that for all 
n > j the Hamming distance of Li n if” and L n if” is at least a{\S\’^) ■ JL’]"'. 
That means, Li cannot predict the language L restricted to words of length n 
within an error probability smaller than a(ji7j”) = ^ 

On the other hand the sequence $L_0[b_n, b_{n+1} - 1] \ldots L_{g(n)}[b_n, b_{n+1} - 1]$
can be reconstructed if $g(n)$, n, and the number of elements of the corresponding
sublanguages are known. Thus, it has a Kolmogorov complexity of at most
$O(\log g(n)) + \log(g(n) \cdot |\Sigma^n|) + c_1$ for a constant $c_1$. Using this
sequence we can easily construct $L \cap \Sigma^n$. Hence, the Kolmogorov complexity
of $L \cap \Sigma^n$ is bounded by $n + e(n)$ for an arbitrarily small
$e \in \omega(1)$ if f is chosen appropriately. □

If we restrict ourselves to computational complexity classes we can apply the
results shown so far also to classes specified by time bounded Turing machines.
Let DTime(T) resp. $\mathrm{DTime}_k(T)$ be the class of all languages which can be
accepted by a T-time bounded deterministic (k-tape) Turing machine. We call a
function f T-time k-tape computable, if $f \in \mathrm{FDTime}_k(T)$, and a function
T is called time constructible, if $T \in \mathrm{FDTime}_2(T)$. In [CaSe 99, ReSc 96]
tight average time hierarchies are presented for very carefully defined average
time classes. We state corresponding hierarchies for both error classes
following from Theorems 1 and 2.

Corollary 1 LetTi,T 2 be two time- constructible functions withTi G uj{T 2 ) and 
f ^io 1, f G o{V df) T\-time computable functions. Then for k >2 there exists 






a function S € w(l) with S ■ T 2 €: o(Ti) such that it holds 

DTimefe(Ti) % (1 _ _ErrDTimefc(T 2 ) , 

DTimefe(Ti) % (l ~ /(|i;| 5 (r.)) ) "Errio DTimefc(T 2 ) . 

Hence, there are languages computable in time T which cannot be accepted 
by a Turing machine with asymptotically slower running time within an error 
significantly smaller than i. Of course, these results can also be transfered to 
other computational resources like space or reversals. To transfer these results to 
DTime(T), an additional factor of logT for the tape reduction has to be taken 
into account. 



4 Partial Recursive Functions with High Error Bounds 

In the last section we showed that there are languages which cannot be approx- 
imated by any partial recursive language within an io-error bound significantly 
smaller than 1/2. To identify some well known languages which cannot be ap- 
proximated by a partial recursive language within a (nearly) constant io-error 
fraction we consider the Halting Problem and the Kolmogorov complexity. Note 
that their complements are not partial recursive. 

We will show that both problems cannot be solved by a recursive function
within an io-error smaller than a constant, with the constant depending on the
chosen programming system. That means that any algorithm, like a universal
program checker, claiming to solve one of these problems within a negligibly
small error, fails.



4.1 Lower Bounds for the Halting Problem 

The halting problem occurs for various models of computation. Obviously, the
error complexity depends on the chosen programming system. To get a general
approach, we follow the notation of [Smit 94]:

Definition 3 A programming system $\varphi$ is a sequence
$\varphi_0, \varphi_1, \varphi_2, \ldots$ of all partial recursive functions such that
there exists a universal program $u \in \mathbb{N}$ with
$\varphi_u(\langle i, x \rangle) = \varphi_i(x)$ for a bijective function
$\langle \cdot, \cdot \rangle : \Sigma^* \times \Sigma^* \to \Sigma^*$ called pairing
function. The halting problem $H_\varphi$ for a programming system $\varphi$ is
defined as: Given a pair $\langle i, x \rangle$, decide whether
$x \in \mathrm{dom}(\varphi_i)$.

The programming system highly influences the error complexity of the halting
problem. Consider for example a programming system $\psi$ for a binary alphabet
where $\psi_{2^i} = \varphi_i$ and $\psi_j$ describes the identity function for all
$j \ne 2^i$. Of course this anomalous programming system allows a program to
compute the halting problem within exponentially small error probability. To
restrict the programming systems we define:
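The effect of the anomalous programming system can be quantified with a short Python sketch. It rests on simplifying assumptions that are not part of the text: program indices are taken as the natural numbers below $2^n$, the pairing with the input is ignored, and the heuristic simply answers "halts" for every program. Since every index that is not a power of two denotes the identity function, the heuristic can err only on the power-of-two indices.

```python
def is_power_of_two(j):
    """True for j = 1, 2, 4, 8, ..."""
    return j > 0 and j & (j - 1) == 0

def max_error_fraction(n):
    """Upper bound on the error of the 'always halts' heuristic among indices j < 2^n.

    Only indices of the form 2^i encode non-trivial programs in this toy model.
    """
    hard = sum(1 for j in range(2 ** n) if is_power_of_two(j))
    return hard / 2 ** n
```

Only n of the $2^n$ indices are powers of two, so the bound computed by `max_error_fraction(n)` equals $n / 2^n$ and vanishes exponentially fast.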






Definition 4 The repetition rate of domain equivalent partial functions
$RD_{\varphi,i}$ is defined as
$RD_{\varphi,i}(n) := \mathrm{Prob}[\mathrm{dom}(\varphi_x) = \mathrm{dom}(\varphi_i) \mid x \in \Sigma^n]$.
A programming system $\varphi$ is dense, iff
$\forall i \ \exists c > 0 : RD_{\varphi,i}(n) >_{ae} c$.

Note that most real-world programming systems, like PASCAL, are dense.

The second parameter directly influencing the error complexity of the halting 
problem is the pairing function of the universal program. We can change the 
situation considerably, if the pairing function is chosen appropriately. Then we 
can achieve the highest possible error complexity using a diagonalization. 

Theorem 3 There exists a pairing function such that for all programming systems
$\varphi$ and for any function $f <_{ae} 1$ it holds $H_\varphi \notin f$-Err $\mathcal{REC}$.

This only shows that an artificial pairing function can cause high error complex- 
ity. To derive more general results we define the notion of pair-fairness. 

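Definition 5 We call a pairing function pair-fair, if for sets
$X, Y \subseteq \Sigma^*$ with
$\exists c_1 > 0 \ \forall n : \frac{|X \cap \Sigma^n|}{|\Sigma^n|} > c_1$ and
$\exists \ell_1, \ell_2 \in \mathbb{N} : Y = \{ w \mid \mathrm{ord}(w) \equiv \ell_1 \pmod{\ell_2} \}$
it holds
$\exists c_2 > 0 \ \forall_{ae} n : \mathrm{Prob}[x \in X \wedge y \in Y \mid \langle x, y \rangle \in \Sigma^n] > c_2$.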



Proposition 3 The standard pairing (x, y) = x -\- pair-fair. 

In the following we restrict our considerations to fair pairing functions (•, •). 
Before we can prove a lower bound of the error complexity of the halting problem, 
we have to show the following technical lemma. 

Lemma 3 For a pair-fair function (•, •) and a pairing function ((•, •)) it holds 
Vi Vx Vy : dom((/33,) = dom((^j) ^ (p,{{x, {{x,y)))) H^{{x, {{x,y)))) . 

Choosing $\langle\langle x, y \rangle\rangle := 2^x \cdot (2y + 1) - 1$, we can conclude from Lemma 3:

Theorem 4 For any dense programming system $\varphi$ it holds

$\forall M \ \exists \alpha > 0 \ \forall_{ae} n : \mathrm{Prob}[M(x) \ne H_\varphi(x) \mid x \in \Sigma^n] > \alpha$.

This means that every heuristic that claims to solve the halting problem makes 
at least a constant fraction of errors. 
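The pairing function $2^x \cdot (2y+1) - 1$ chosen before Theorem 4 is a bijection between pairs of natural numbers and natural numbers, because every positive integer factors uniquely into a power of two and an odd number. A small Python sketch (with the hypothetical names `pair` and `unpair`, and with words identified with numbers via ord) shows the encoding and its inverse:

```python
def pair(x, y):
    """Encode (x, y) as 2^x * (2y + 1) - 1."""
    return (1 << x) * (2 * y + 1) - 1

def unpair(z):
    """Decode z: x counts the factors of two in z + 1, y comes from the odd part."""
    z += 1
    x = 0
    while z % 2 == 0:
        z //= 2
        x += 1
    return x, (z - 1) // 2
```

Note that the first component enters exponentially: the code of a pair has roughly x plus the bit length of y many bits, so a fixed x is paired with a constant fraction of all codes of a given length.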

Corollary 2 For any dense programming system $\varphi$ and any function
$f \in \omega(1)$, it holds $H_\varphi \notin \frac{1}{f}$-Err$_{io}$ $\mathcal{REC}$.

The question whether or not there exists a constant lower bound for the io-error
complexity of halting is still open. Perhaps the trivial constant upper bound
can be improved by showing that for a sequence of Turing machines the error
complexity tends to zero in the limit. The last corollary implies that
$H_\varphi \notin$ Nearly-$\mathcal{REC}$. Thus, even an improved upper bound would
not be helpful for practical issues.






4.2 Lower Error Bounds for Kolmogorov Complexity 

Another well known non-recursive problem is Kolmogorov complexity. One of 
the fundamental results of Kolmogorov complexity theory is the proof of the 
existence of a universal programming system. Such a programming system is not 
necessarily dense, although many programming systems provide both properties. 
A sufficient condition for both features is the capability of the universal program 
to store a fixed input parameter one-to-one into the index string, that means
there exists a function $s : \Sigma^* \to \Sigma^*$ such that for all x, y it holds
$\varphi_u(\langle x, y \rangle) = \varphi_{s(x)}(y)$ and $|s(x)| = |x| + O(1)$. This
observation implies a trivial upper bound for the Kolmogorov complexity of x.

We consider the following decision problem based on the Kolmogorov complexity 
for a classification of its io-error complexity. 

Definition 6 For a function $f : \mathbb{N} \to \mathbb{N}$ define $C_{<f}$ as the set of
all inputs x with Kolmogorov complexity smaller than $f(|x|)$, i.e.
$C_{<f} := \{ x \in \Sigma^* \mid C(x) < f(|x|) \}$. For a constant c we define the
function $K_c : \mathbb{N} \to [0,1]$ as
$K_c(n) := \mathrm{Prob}[C(x) < n - c \mid x \in \Sigma^n]$.

In general, the functions C and $K_c$ are not recursive. But at least $C_{<f}$ is
recursively enumerable. In the following we will investigate the size of
$C_{<n-c}$ and show a linear lower and upper bound.

Lemma 4 For any constant $c > 1$ there exist constants $k_1, k_2 > 0$ such that

$k_1 <_{ae} K_c <_{ae} 1 - k_2$.

It is well known that for small recursive functions $f, g < \log n$ with
$f \in \Omega(g)$ and $g \in \omega(1)$ the set $C_{<f}$ is partially recursive.
Furthermore, no infinite recursively enumerable set A is completely included in
the complement $\overline{C_{<f}}$, i.e. $A \cap C_{<f} \ne \emptyset$. The following
lemma substantiates the size of this non-empty set $A \cap C_{<f}$ for
$f(n) = n - c$.

Lemma 5 Let $A \in \mathcal{RE}$ such that $|A \cap \Sigma^n| >_{ae} c_1 \cdot |\Sigma^n|$
for a constant $c_1 > 0$. Then it holds
$\forall c_2 > 0 \ \exists c_3 > 0 : \mathrm{Prob}[C(x) < n - c_2 \mid x \in A \cap \Sigma^n] >_{ae} c_3$.

Using these lemmas the following lower bound can be shown. 

Theorem 5 For any $c > 1$ there exists a constant $\alpha < 1$ such that

$C_{<n-c} \notin \alpha$-Err$_{io}$ $\mathcal{REC}$.

Proof: For a machine M define $K_n := \Sigma^n \cap C_{<n-c}$,
$A_n := \{ x \in \Sigma^n \mid M(x) = 0 \}$, and
$F_n := \{ x \in \Sigma^n \mid M(x) \ne C_{<n-c}(x) \}$ for all $n \in \mathbb{N}$.

Note that $\Sigma^n \setminus (A_n \cup K_n) \subseteq F_n$ and
$A_n \cap K_n \subseteq F_n$. From Lemma 4 we can conclude that
$k_1 \cdot |\Sigma^n| <_{ae} |K_n| <_{ae} (1 - k_2) \cdot |\Sigma^n|$ and therefore either
$c_1 \cdot |\Sigma^n| <_{ae} |A_n|$ or
$|A_n \cup K_n| <_{ae} (1 - c_2) \cdot |\Sigma^n|$ for some constants
$k_1, k_2, c_1, c_2 > 0$. Using Lemma 5 it follows that for a constant $c_3 > 0$,
$|A_n \cap K_n| >_{ae} c_3 \cdot |\Sigma^n|$ or
$|A_n \cup K_n| <_{ae} (1 - c_2) \cdot |\Sigma^n|$. Finally, we can conclude that
for some constant $c_4 > 0$ it holds $|F_n| >_{ae} c_4 \cdot |\Sigma^n|$. □
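The two inclusions at the start of the proof are pure set algebra and can be checked mechanically on small universes. The Python sketch below assumes a machine M with outputs in {0, 1}, so that M answers 0 exactly on $A_n$; the sets are random stand-ins (the true set $K_n$ is not computable), and all names are hypothetical.

```python
import random

def error_set(universe, A, K):
    """Inputs on which M errs, where M(x) = 0 exactly on A and K is the target set."""
    return {x for x in universe if (x not in A) != (x in K)}

def check_inclusions(universe, A, K):
    """Both inclusions from the proof; for 0/1-valued M they even cover all of F."""
    F = error_set(universe, A, K)
    outside_both = universe - (A | K)    # M says 1, but x is not in K
    inside_both = A & K                  # M says 0, but x is in K
    return outside_both <= F and inside_both <= F and F == outside_both | inside_both

rng = random.Random(0)                   # fixed seed for reproducibility
universe = set(range(64))
samples = [
    ({x for x in universe if rng.random() < 0.5},
     {x for x in universe if rng.random() < 0.3})
    for _ in range(20)
]
all_ok = all(check_inclusions(universe, A, K) for A, K in samples)
```

Every error of M falls into one of the two sets, so a lower bound on either $|\Sigma^n \setminus (A_n \cup K_n)|$ or $|A_n \cap K_n|$ yields a lower bound on $|F_n|$, which is how the case distinction in the proof is used.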

It follows that there exists a fixed constant $\alpha$ such that no matter which
algorithm tries to compute $C_{<n-c}$ it fails for at least a fraction $\alpha$ of
all inputs.






5 Discussion 

One might expect that the concepts of immune sets and complexity cores are
suitable for showing lower bounds. Recall that a set S is called
$\mathcal{C}$-immune if it has no infinite subset in $\mathcal{C}$. S is called
$\mathcal{C}$-bi-immune if S and its complement $\overline{S}$ are immune for
$\mathcal{C}$. A recursive set X is called a complexity core of S if for every
algorithm M recognizing S and every polynomial p, the running time of M on x
exceeds $p(|x|)$ on all but finitely many $x \in X$.

Orponen and Schöning [ScOr 84] observed that a set $S \notin$ P is bi-immune for
P iff $\Sigma^*$ is a complexity core for S. But this does not imply reasonable
lower bounds for the error complexity, since a precondition for complexity cores
is the correct computation of an algorithm. Since bi-immune sets may be very
sparse, even the trivial language $\emptyset$ gives a low error bound. Thus, the
knowledge of a bi-immune set S for $\mathcal{C}$ does not result in a high error
complexity. It is an open problem how density for bi-immune sets has to be
defined such that reasonable results for the error complexity can be achieved.

However, we can show that the existence of an immune set of a class
$\mathcal{C}$ corresponds to a small error bound separation:

Theorem 6 Let $\mathcal{C}$ be closed under finite variation and $\emptyset \in \mathcal{C}$. There exists a
C -immune set iff C yf j^^-ErrC n . 

Since some elementary closure properties of C guarantee the existence of 
a $\mathcal{C}$-immune set [BoDu 87], this restriction is not severe. On the other hand
in [Yesh 83] it is shown that the structural property of conjunctively-self-re- 
ducibility suffices to overcome all erroneous outputs. 

As shown in Section 4, the Kolmogorov complexity problem, i.e. the question
whether a string can be compressed by more than a constant, cannot be
computed by any machine within a smaller error probability than a constant.
It is notable that this error probability is independent of the machine. Both
the halting and the Kolmogorov problem are not in Nearly-BPP and remain
intractable with respect to the set of recursive predicates.

Because of their strong relationship to the halting problem some other problems,
like program verification or virus-program detection, are not recursive in the
general case either. So it seems that the lower bounds proved so far influence
the error bounds of these problems. The exact classification is still an open
problem.



Acknowledgment 

We would like to thank Karin Genther, Rüdiger Reischuk, Arfst Nickelsen,
Gerhard Buntrock, Hanno Lefmann, and Stephan Weis for helpful suggestions,
criticism and fruitful discussions. Furthermore, we thank several unknown
referees and the members of the program committee for their suggestions,
comments and pointers to the literature.






References

BCG 92.  S. Ben-David, B. Chor, O. Goldreich, M. Luby, On the Theory of Average
         Case Complexity, Journal of Computer and System Sciences, Vol. 44, 1992,
         193-219.
BoDu 87. R. Book, D. Du, The Existence and Density of Generalized Complexity
         Cores, Journal of the ACM, Vol. 34, No. 3, 1987, 718-730.
CaSe 99. J-Y. Cai and A. Selman, Fine separation of average time complexity
         classes, SIAM Journal on Computing, Vol. 28(4), 1999, 1310-1325.
DaPu 60. Martin Davis, Hilary Putnam, A Computing Procedure for Quantification
         Theory, Journal of the ACM, 1960, 201-215.
Fagi 92. B. Fagin, Fast Addition for Large Integers, IEEE Tr. Comput. Vol. 41,
         1992, 1069-1077.
GeHo 94. P. Gemmell, M. Harchol, Tight Bounds on Expected Time to Add Correctly
         and Add Mostly Correctly, Inform. Proc. Letters, 1994, 77-83.
GGH 93.  M. Goldmann, P. Grape and J. Håstad, On average time hierarchies,
         Information Processing Letters, Vol. 49(1), 1994, 15-20.
GoMi 82. S. Goldwasser, S. Micali, Probabilistic Encryption & How to Play Mental
         Poker Keeping Secret All Partial Information, Proc. 14th Annual ACM
         Symposium on Theory of Computing, 1982, 365-377.
GMR 88.  S. Goldwasser, S. Micali, R. Rivest, A Digital Signature Scheme Secure
         Against Adaptive Chosen-Message Attacks, SIAM J. Comput. Vol. 17, No. 2,
         1988, 281-308.
Gure 91. Y. Gurevich, Average Case Completeness, Journal of Computer and System
         Sciences, Vol. 42, 1991, 346-398.
Hoos 98. H. Hoos, Stochastic Local Search - Methods, Models, Applications,
         Dissertation, Technical University Darmstadt, 1998.
Imp1 95. R. Impagliazzo, Hard-core distributions for somewhat hard problems, In
         36th Annual Symposium on Foundations of Computer Science, 1995, 538-545.
Imp2 95. R. Impagliazzo, A personal view of average-case complexity, In
         Proceedings of the Tenth Annual Structure in Complexity Theory
         Conference, 134-147, 1995.
ImWi 98. R. Impagliazzo, A. Wigderson, Randomness vs. Time: De-randomization
         under a uniform assumption, Proc. 39th Symposium on Foundations of
         Computer Science, 1998, 734-743.
Levi 86. Leonid Levin, Average Case Complete Problems, SIAM Journal on
         Computing, Vol. 15, 1986, 285-286.
LiVi 97. M. Li, P. Vitányi, An Introduction to Kolmogorov Complexity and its
         Application, Springer, 1997.
PuBr 85. P. W. Purdom, C. A. Brown, The Pure Literal Rule and Polynomial Average
         Time, SIAM J. Comput., 1985, 943-953.
Reif 83. J. Reif, Probabilistic Parallel Prefix Computation, Comp. Math. Applic.
         26, 1993, 101-110. Technical Report, Harvard University, 1983.
ReSc 96. R. Reischuk, C. Schindelhauer, An Average Complexity Measure that Yields
         Tight Hierarchies, Journal on Computational Complexity, Vol. 6, 1996,
         133-173.
Schi 96. C. Schindelhauer, Average- und Median-Komplexitätsklassen, Dissertation,
         Medizinische Universität Lübeck, 1996.
ScJa 97. C. Schindelhauer, A. Jakoby, Computational Error Complexity Classes,
         Technical Report A-97-17, Medizinische Universität Lübeck, 1997.
ScOr 84. U. Schöning, P. Orponen, The Structure of Polynomial Complexity Cores,
         Proc. 11th Symposium MFCS, LNCS 176, 1984, 452-458.
ScWa 94. R. Schuler, O. Watanabe, Towards Average-Case Complexity Analysis of
         NP Optimization Problems, Proc. 10th Annual IEEE Conference on Structure
         in Complexity Theory, 1995, 148-159.
SLM 92.  Bart Selman, Hector Levesque, David Mitchell, Hard and Easy
         Distributions of SAT Problems, Proc. 10. Nat. Conf. on Artificial
         Intelligence, 1992, 440-446.
Smit 94. C. Smith, A Recursive Introduction to the Theory of Computation,
         Springer, 1994.
Yam1 96. T. Yamakami, Average Case Computational Complexity Theory, PhD Thesis,
         Technical Report 307/97, Department of Computer Science, University of
         Toronto.
Yam2 96. T. Yamakami, Polynomial Time Samplable Distributions, Proc. Mathematical
         Foundations of Computer Science, 1996, 566-578.
Yesh 83. Y. Yesha, On certain polynomial-time truth-table reducibilities of
         complete sets to sparse sets, SIAM Journal on Computing, Vol. 12(3),
         1983, 411-425.



Analysis of Quantum Functions* 

(Preliminary Version) 



Tomoyuki Yamakami 

Department of Computer Science, Princeton University 
Princeton, New Jersey 08544 



Abstract. Quantum functions are functions that are defined in terms
of quantum mechanical computation. Besides quantum computable functions,
we study quantum probability functions, which compute the acceptance
probability of quantum computation. We also investigate quantum
gap functions, which compute the gap between acceptance and rejection
probabilities of quantum computation.



1 Introduction 

A paradigm of quantum mechanical computers was first proposed in the early 1980s
[2,10] to gain more computational power over classical computers. Recent
discoveries of fast quantum algorithms for a variety of problems have raised much
enthusiasm among computer scientists as well as physicists. These discoveries
have supplied general and useful tools in programming quantum algorithms.

We use in this paper a multi-tape quantum Turing machine (abbreviated QTM)
[6,4] as a mathematical model of a quantum computer. A well-formed
QTM can be identified with a unitary operator, a so-called time-evolution
operator, that acts on the infinite dimensional space of superpositions, which
are linear combinations of configurations of the QTM.

The main theme of this paper is a study of polynomial time-bounded quan- 
tum functions. A quantum computable function, a typical quantum function, 
computes an output of a QTM with high probability. Let FQP and FBQP 
denote the collections of functions computed by polynomial-time, well-formed 
QTMs, respectively, with certainty and with probability at least 2/3. A
quantum probability function, by contrast, computes the acceptance probability of
a QTM. For notational convenience, #QP denotes the collection of such
quantum functions particularly witnessed by polynomial-time, well-formed QTMs.
Another important quantum function is the one that computes the gap between 
the acceptance and rejection probabilities of a QTM. We call such functions 
quantum gap functions and use the notation GapQP to denote the collection of 
polynomial-time quantum gap functions. 

In this paper, we explore the characteristic feature of these FQP-, FBQP-, 
#QP-, and GapQP-functions and study the close connection between GapQP- 
functions and classical GapP -functions. One of the most striking features of 

* This work was partly supported by NSERC Fellowship and DIMACS Fellowship.



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.): FSTTCS’99, LNCS 1738, pp. 407—419, 1999. 
(c) Springer- Verlag Berlin Heidelberg 1999 






quantum gap functions is that if $f \in$ GapQP then $f^2 \in$ #QP, where
$f^2(x) = (f(x))^2$.

We also study relativized quantum functions that can access oracles to help
their computation. We exhibit an oracle A showing that $\mathrm{FQP}^A$ is more
powerful than $\#\mathrm{P}^A$ and also show the existence of another oracle A such
that $\#\mathrm{QP}^A$ is not included in the non-adaptive version of $\#\mathrm{QP}^A$.



2 Basic Notions and Notation 

We use the standard notions and notation that are found elsewhere. In this 
section, we explain only what needs special attention. 

Let A denote the set of complex algebraic numbers and let C denote the set
of complex numbers whose real and imaginary parts can be approximated to
within $2^{-n}$ in time polynomial in n. We freely identify any natural number with
its binary representation. When we discuss integers, we also identify each integer
with its binary representation following a sign bit that indicates the (positive or
negative) sign$^1$ of the integer. Moreover, a rational number is also identified
with a pair of integers, which are further identified with binary integers.

Let $\mathbb{N}^{\Sigma^*}$ be the set of all functions that map $\Sigma^*$ to
$\mathbb{N}$. Similarly, we define $\{0,1\}^{\Sigma^*}$, etc. We identify a set S with
its characteristic function, which is defined as $S(x) = 1$ if $x \in S$ and 0
otherwise. In this paper, a polynomial with k variables means an element in
$\mathbb{N}[x_1, x_2, \ldots, x_k]$.

We assume the reader’s familiarity with central complexity classes, such as
P, NP, BPP, PP, and PSPACE, and function classes, such as FP, #P [13],
and GapP [8]. Note that FP $\subseteq$ #P $\subseteq$ GapP.

The notion of a quantum Turing machine (abbreviated QTM) was originally 
introduced in [6] and developed in [4]. For convenience, we use in this paper 
a slightly more general definition of QTMs given in [14]: a $k$-tape QTM $M$ is a 
quintuple $(Q, \{q_0\}, Q_f, \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k, \delta)$, where each $\Sigma_i$ is a finite alphabet 
with a distinguished blank symbol, $Q$ is a finite set of states including an initial 
state $q_0$ and a set $Q_f$ of final states, and $\delta$ is a multi-valued quantum transition 
function from $Q \times \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k$ to $\mathbb{C}^{Q \times \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k \times \{L,R,N\}^k}$. A QTM 
has $k$ two-way infinite tapes of cells indexed by $\mathbb{Z}$ and read/write heads that 
move along the tapes either to the left or to the right, or stay still. In 
particular, the last tape of $M$ is used to write an output. It is known from [4,16,14] 
that our model is polynomially “equivalent” to the more restrictive model of [4], 
which is called conservative in [14]. 

Let $M$ be a QTM. The running time of $M$ on input $x$ is defined to be the 
minimal number $T$, if any, such that, at time $T$, every computation path of $M$ 
on $x$ reaches a certain final configuration. We say that $M$ on input $x$ halts in 
time $T$ if its running time is defined and equals $T$. We call $M$ a polynomial-time 
QTM if there exists a polynomial $p$ such that, for every $x$, $M$ on input $x$ halts 
in time $p(|x|)$. A QTM is well-formed if its time-evolution operator preserves 

^1 For example, we set 1 for a positive integer and 0 for a negative integer. 



Analysis of Quantum Functions 409 



the $L_2$-norm. For any $x$, the notation $\rho_M(x)$ denotes the acceptance probability 
of $M$ on $x$; that is, the sum of every squared magnitude of the amplitude, in the 
final superposition of $M$, of any configuration whose output tape consists of the 
symbol “1” (called an accepting configuration). For a nonempty subset $K$ of $\mathbb{C}$, 
we say that $M$ has $K$-amplitudes if the entries of its time-evolution matrix are all 
drawn from $K$. For more notions and terminology (e.g., stationary, synchronous, 
normal form), we refer the reader to [4,14]. 

We also use an oracle QTM that is equipped with an extra tape, called a 
query tape, and two distinguished states, a pre-query state $q_p$ and a post-query 
state $q_a$. The oracle QTM invokes an oracle query by entering state $q_p$. In a 
single step, the content $|y \circ b\rangle$ of the query tape, where $b \in \{0,1\}$ and “$\circ$” 
denotes concatenation, is changed into $|y \circ (b \oplus A(y))\rangle$ and the machine enters 
state $q_a$. 

For a superposition $|\phi\rangle$, the notation $M(|\phi\rangle)$ denotes the final superposition 
of $M$ that starts with $|\phi\rangle$ as an initial superposition. 

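There are two useful unitary transforms used in this paper. The phase shift $P$ 
maps $|0\rangle$ to $-|0\rangle$ and $|1\rangle$ to $|1\rangle$. The Hadamard transform $H$ changes $|0\rangle$ into 
$\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|1\rangle$ into $\frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. 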
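
As a quick sanity check on the two transforms just described, the following minimal Python sketch writes the phase shift and the Hadamard transform as 2x2 matrices over the basis (|0>, |1>) and confirms that both are unitary. The helper names `apply` and `is_unitary` are illustrative assumptions and do not come from the paper.

```python
import math

# 2x2 matrices as nested lists; the basis order is (|0>, |1>).
S = 1 / math.sqrt(2)
P = [[-1, 0], [0, 1]]      # phase shift: |0> -> -|0>, |1> -> |1>
H = [[S, S], [S, -S]]      # Hadamard transform


def apply(gate, state):
    """Multiply a 2x2 gate by a length-2 amplitude vector."""
    return [sum(gate[r][c] * state[c] for c in range(2)) for r in range(2)]


def is_unitary(gate, tol=1e-12):
    """Check that the columns of a real 2x2 gate are orthonormal."""
    for i in range(2):
        for j in range(2):
            dot = sum(gate[k][i] * gate[k][j] for k in range(2))
            if abs(dot - (1 if i == j else 0)) > tol:
                return False
    return True


ket0, ket1 = [1, 0], [0, 1]
print(apply(P, ket0))                 # amplitudes of P|0>
print(apply(H, ket1))                 # amplitudes of H|1>
print(is_unitary(P), is_unitary(H))
```

Both matrices have real entries here, so the unitarity check reduces to orthonormality of the columns.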

Throughout this paper, $K$ denotes an arbitrary subset of $\mathbb{C}$ that includes 
$\{0, \pm 1\}$. All quantum function classes discussed in this paper depend on the 
choice of $K$-amplitudes. We find it convenient to drop the subscript $K$ when $K = \tilde{\mathbb{C}}$. 

3 Various Quantum Functions 

In this section, we formally define a variety of quantum functions and discuss 
their fundamental properties. Recall that $\{0, \pm 1\} \subseteq K \subseteq \mathbb{C}$. 

3.1 Exact Quantum Computable Functions. We begin with quantum 
functions whose values are the direct outputs, with certainty, of polynomial-time, 
well-formed QTMs. We call them exact quantum polynomial-time computable in 
a similar fashion to polynomial-time computable functions. 

Definition 1. Let $\mathrm{FQP}_K$ be the set of $K$-amplitude exact quantum polynomial-time 
computable functions; that is, the set of functions $f$ for which there exists a 
polynomial-time, well-formed QTM $M$ with $K$-amplitudes such that, on every input $x$, 
$M$ outputs $f(x)$ with certainty. 

As noted in Section 2, we drop the subscript $K$ when $K = \tilde{\mathbb{C}}$ and write FQP 
instead of $\mathrm{FQP}_K$. Note that $\mathrm{FP} \subseteq \mathrm{FQP}_K$. 

We first show a relativized separation result. For a nondeterministic TM $M$ 
and a string $x$, the notation $\#M(x)$ denotes the number of accepting computation 
paths of $M$ on input $x$, and let $\#\mathrm{TIME}(t)$ be the collection of all functions 
$\lambda x.\#M(x)$ for nondeterministic TMs $M$ running in time at most $t(|x|)$. For a 
set $T$, set $\#\mathrm{TIME}(T) = \bigcup_{t \in T} \#\mathrm{TIME}(t)$. 

Theorem 1. There exists a set $A$ such that $\mathrm{FQP}^A \nsubseteq \#\mathrm{TIME}(o(2^n))^A$. 






Proof. For a set $A$, let $f^A(x) = 2^{-2|x|} \cdot (|A \cap \Sigma^{|x|}| - |\Sigma^{|x|} \setminus A|)^2$ for every $x$. 

Consider the following oracle QTM $N$. On input $x$ of length $n$, write $0^n \circ 0$ in a 
query tape and apply $H^{\otimes n} \otimes I$. Invoke an oracle query and then apply the phase 
shift $P$ to the oracle answer qubit. Again, make a query to $A$ and apply $H^{\otimes n} \otimes I$. 
Accept $x$ if $N$ observes $|0^n \circ 0\rangle$ in the query tape, and reject $x$ otherwise. It 
follows by a simple calculation that $f^A(x) = \rho_{N^A}(x)$. In particular, $f^A$ belongs 
to $\mathrm{FQP}^A$ if $A$ satisfies the condition that, for every $x$, either $|A \cap \Sigma^{|x|}| = |\Sigma^{|x|} \setminus A|$ 
or $|A \cap \Sigma^{|x|}| \cdot |\Sigma^{|x|} \setminus A| = 0$. 
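
The "simple calculation" can be checked numerically for small n. The sketch below is a plain state-vector simulation of the steps described for N (Hadamards on the query qubits, query, phase shift on the answer qubit, second query, Hadamards again) and compares the probability of observing the all-zero query tape with the closed form $2^{-2n}(|A \cap \Sigma^n| - |\Sigma^n \setminus A|)^2$. It is only an illustration under stated assumptions: strings of length n are encoded as integers, the oracle is a Python function returning 0 or 1, and all function names are hypothetical.

```python
import math


def hadamard_on_query_qubits(state, n):
    """Apply H to each of the n query qubits; bit 0 (the answer qubit) is untouched."""
    state = list(state)
    s = 1 / math.sqrt(2)
    for q in range(1, n + 1):
        step = 1 << q
        for i in range(len(state)):
            if not i & step:
                a, b = state[i], state[i | step]
                state[i], state[i | step] = s * (a + b), s * (a - b)
    return state


def query(state, oracle):
    """Oracle step: |y o b> -> |y o (b xor A(y))>."""
    out = [0.0] * len(state)
    for i, amp in enumerate(state):
        y, b = i >> 1, i & 1
        out[(y << 1) | (b ^ oracle(y))] += amp
    return out


def phase_shift_on_answer(state):
    """Phase shift P on the answer qubit: |0> -> -|0>, |1> -> |1>."""
    return [amp if i & 1 else -amp for i, amp in enumerate(state)]


def acceptance_probability(oracle, n):
    """Probability of observing |0^n o 0> at the end of the procedure."""
    state = [0.0] * (1 << (n + 1))
    state[0] = 1.0                          # start in |0^n o 0>
    state = hadamard_on_query_qubits(state, n)
    state = query(state, oracle)
    state = phase_shift_on_answer(state)
    state = query(state, oracle)            # second query resets the answer qubit
    state = hadamard_on_query_qubits(state, n)
    return abs(state[0]) ** 2


def closed_form(oracle, n):
    """2^{-2n} * (|A ∩ Σ^n| - |Σ^n \\ A|)^2 for the same oracle."""
    inside = sum(oracle(y) for y in range(1 << n))
    outside = (1 << n) - inside
    return (inside - outside) ** 2 / 4 ** n
```

For instance, `acceptance_probability(lambda y: 1, 3)` and `closed_form(lambda y: 1, 3)` can be compared directly; an oracle containing exactly half of the strings of length n gives acceptance probability 0.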

Let $\{M_i\}_{i \in \mathbb{N}}$ and $\{q_i\}_{i \in \mathbb{N}}$ be two enumerations of all polynomial-time 
nondeterministic TMs and all nondecreasing functions in $o(2^n)$, respectively, such 
that each $M_i$ halts in time at most $q_i(n)$ on all inputs of length $n$. Initially, 
set $n_{-1} = 0$ and $A_{-1} = \emptyset$. At stage $i$ of the construction of $A$, let $n_i$ denote 
the minimal integer satisfying that $n_{i-1} < n_i$ and $q_i(n_i) < 2^{n_i - 1}$. First let 
$B = A_{i-1} \cup \Sigma^{n_i}$. Clearly $f^B(0^{n_i}) = 1$. If $\#M_i^B(0^{n_i}) \neq 1$, then define $A_i$ to 
be $B$. Assume otherwise. There exists a unique accepting computation path $p$ 
of $M_i$ on $0^{n_i}$. Let $Q$ denote the set of all words that $M_i$ queries along path $p$. 
Since $|Q| \le q_i(n_i) < 2^{n_i - 1}$, there is a set $C$ with $A_{i-1} \subseteq C \subseteq B$ that contains 
every word of $Q \cap B$ and satisfies $|C \cap \Sigma^{n_i}| = |\Sigma^{n_i} \setminus C|$. For this $C$, 
$\#M_i^C(0^{n_i}) \ge 1$ but $f^C(0^{n_i}) = 0$. □ 

3.2 Bounded Error Quantum Computable Functions. By replacing 
exact quantum computation in Definition 1 with bounded-error quantum com- 
putation, we can define another quantum function class FBQP. 

Definition 2. A function $f$ is in $\mathrm{FBQP}_K$ if there exist a constant $\epsilon \in (0, 1/2]$ and 
a polynomial-time, well-formed QTM $M$ with $K$-amplitudes that, on input $x$, 
outputs $f(x)$ with probability at least $1/2 + \epsilon$. More generally, for a function $t$, $\mathrm{FBQTIME}(t)$ 
is defined similarly but by requiring $M$ to halt in time $t(|x|)$. For a set $T$, define 
$\mathrm{FBQTIME}(T) = \bigcup_{t \in T} \mathrm{FBQTIME}(t)$. 

Clearly, FQP^ C FBQP^ and FBQPq C 

Brassard et al. [5] extend Grover's database search algorithm [12] and show 
how to compute with $\epsilon$-accuracy the amplitude of a given superposition in 
$O(\sqrt{N})$ time, where $N$ is the size of the search space, with high probability. In 
our terminology, $\#\mathrm{TIME}(O(n)) \subseteq \mathrm{FBQTIME}(O(2^{n/2}))$. 

Bennett et al. [3], however, show that there exists an NP-set that cannot be 
recognized by any well-formed QTM running in time $o(2^{n/2})$ relative to a random 
oracle. This immediately implies that $\#\mathrm{P}^A \nsubseteq \mathrm{FBQTIME}(o(2^{n/2}))^A$ relative to 
a random oracle $A$. 

3.3 Quantum Probability Functions. It is essential in quantum complexity 
theory to study the behavior of the acceptance probability of a well- formed QTM. 
Here, we consider quantum functions that output such probabilities. We briefly 
call these functions quantum probability functions. 

Definition 3. A function $f$ from $\Sigma^*$ to $[0, 1]$ is called a polynomial-time quantum 
probability function with $K$-amplitudes if there exists a polynomial-time, 
well-formed QTM $M$ with $K$-amplitudes such that $f(x) = \rho_M(x)$ for all $x$. 






In short, we say that $M$ witnesses $f$ in the above definition. 

By abusing Valiant's notation #P, we can coin a new notation 
for the class of polynomial-time quantum probability functions. 

Definition 4. The notation $\#\mathrm{QP}_K$ denotes the set of all polynomial-time quantum 
probability functions with $K$-amplitudes. 

The lemma below is almost trivial and its proof is left to the reader. 

Lemma 1. 1. Every $\{0,1\}$-valued FQP-function is in #QP. 

2. For every #P-function $f$, there exist an FP-function $\ell$ and a $\#\mathrm{QP}_{\mathbb{Q}}$-function 
$g$ such that $f(x) = \ell(1^{|x|}) g(x)$ for every $x$. 

3. Let $f \in \#\mathrm{QP}$ and $s, r \in \mathrm{FQP}$. Assume that $\mathrm{range}(s) \subseteq (0,1) \cap \mathbb{Q}$ and 
$\mathrm{range}(r) \subseteq (0,1) \cap \mathbb{Q}$. There exists a #QP-function $g$ such that, for every $x$, 
$f(x) = s(x)$ iff $g(x) = r(x)$. 

Next we show several closure properties of #QP-functions. 

Lemma 2. Let $f$ and $g$ be any functions in $\#\mathrm{QP}_K$ and $h$ an FQP-function. 

(1) If $\alpha, \beta \in K$ satisfy $|\alpha|^2 + |\beta|^2 = 1$, then $\lambda x.(|\alpha|^2 f(x) + |\beta|^2 g(x))$ is in $\#\mathrm{QP}_K$. 

(2) $f \circ h \in \#\mathrm{QP}_K$. (3) $f \cdot g$ is in $\#\mathrm{QP}_K$. (4) If $h$ is polynomially bounded^2, 
then $\lambda x.f(x)^{h(x)}$ is in $\#\mathrm{QP}_K$. 

Proof Sketch. We prove only the last claim. Let a well-formed QTM $M_f$ witness $f$ 
with polynomial running time $q$. Assume that a polynomial $p$ satisfies $h(x) \le p(|x|)$. On 
input $x$, run $M_f$ $h(x)$ times and idle $q(|x|)$ steps $p(|x|) - h(x)$ times to avoid the 
timing problem [4,14]. Accept $x$ if all the first $h(x)$ runs of $M_f$ reach accepting 
configurations; otherwise, reject $x$. □ 



3.4 Quantum Gap Functions. Notice that #QP is not closed under subtraction. 
To compensate for the lack of this property, we introduce the quantum 
gap functions. A quantum gap function is defined to compute the difference 
between the acceptance and rejection probabilities of a well-formed QTM. 

Definition 5. A function $f$ from $\Sigma^*$ to $[-1,1]$ is called a polynomial-time 
quantum gap function with $K$-amplitudes if there exists a polynomial-time, 
$K$-amplitude, well-formed QTM $M$ such that, for every $x$, $f(x)$ is 
$\| |\phi^{(1)}\rangle \|^2 - \| |\phi^{(0)}\rangle \|^2$ when $M$ on input $x$ halts in a final superposition 
$|\phi^{(0)}\rangle|0\rangle + |\phi^{(1)}\rangle|1\rangle$, 
where the last qubit represents the content of the output tape of $M$. In other 
words, $f(x) = 2\rho_M(x) - 1$. 

We use the notation $\mathrm{GapQP}_K$ to denote the collection of such functions. 

Definition 6. $\mathrm{GapQP}_K$ is the set of all polynomial-time quantum gap functions 
with $K$-amplitudes. 

^2 A function $f$ from $\Sigma^*$ to $\mathbb{N}$ is polynomially bounded if there exists a polynomial $p$ such 
that $f(x) \le p(|x|)$ for every $x$. 






For any two sets $\mathcal{F}$ and $\mathcal{G}$ of functions, the notation $\mathcal{F} - \mathcal{G}$ denotes the set of 
functions of the form $f - g$, where $f \in \mathcal{F}$ and $g \in \mathcal{G}$. The following proposition 
shows another characterization of $\mathrm{GapQP}_K$. Recall that $\{0, \pm 1\} \subseteq K \subseteq \mathbb{C}$. 

Proposition 1. $\mathrm{GapQP}_K = \#\mathrm{QP}_K - \#\mathrm{QP}_K$. Thus, $\#\mathrm{QP}_K \subseteq \mathrm{GapQP}_K$. 

Proof. Clearly, $\mathrm{GapQP}_K \subseteq \#\mathrm{QP}_K - \#\mathrm{QP}_K$. Conversely, assume that there 
exist two polynomial-time, $K$-amplitude, well-formed QTMs $M_g$ and $M_h$ 
satisfying $f(x) = \rho_{M_g}(x) - \rho_{M_h}(x)$ for all $x$. Consider the following QTM $N$. On 
input $x$, $N$ first writes 0 and applies $H$. If $|1\rangle$ is observed, it simulates $M_g$. 
Let $a_g$ be its output. Otherwise, $N$ simulates $M_h$. Let $a_h$ be its output. $N$ accepts $x$ 
if either $M_g$ was simulated and $a_g = 1$, or $M_h$ was simulated and $1 - a_h = 1$; 
otherwise, it rejects $x$. The acceptance 
probability of $N$ is exactly $\frac{1}{2}\rho_{M_g}(x) + \frac{1}{2}(1 - \rho_{M_h}(x))$. Thus, the gap $2\rho_N(x) - 1$ 
is exactly $\rho_{M_g}(x) - \rho_{M_h}(x)$, which equals $f(x)$. □ 



4 Computational Power of Quantum Gap Functions 

In the previous section, we have introduced quantum gap functions. In this 
section, we discuss the computational power of these functions. 

4.1 Squared Function Theorem. It is shown in [9] that, for every GapP-function 
$f$, there exists a polynomial-time, well-formed QTM that accepts 
input $x$ with probability $2^{-p(|x|)} f(x)^2$ for a certain fixed polynomial $p$. This implies 
that if $f \in \mathrm{GapP}$ then $\lambda x.2^{-p(|x|)} f(x)^2 \in \#\mathrm{QP}$. A slightly different argument 
demonstrates that if $f \in \mathrm{GapQP}$ then $f^2 \in \#\mathrm{QP}$. This is a characteristic 
feature of quantum gap functions. 

Theorem 2. (Squared Function Theorem) If $f \in \mathrm{GapQP}$ then $f^2 \in \#\mathrm{QP}$, 
where $f^2(x) = (f(x))^2$ for all $x$. 

It is, however, unknown whether $\mathrm{GapQP} \cap [0,1]^{\Sigma^*} \subseteq \#\mathrm{QP}$. To show 
Theorem 2, we utilize the following lemma, whose proof generalizes the argument 
used in the proof of Proposition 1. See also Lemma 5 in [14]. 

Lemma 3. (Gap Squaring Lemma) Let $M$ be a well-formed QTM that, on 
input $x$, halts in time $T(x)$. There exists a well-formed QTM $N$ that, on inputs $x$ 
given in tape 1 and $1^{T(x)}$ in tape 2 and empty elsewhere, halts in time $O(T(x)^2)$ 
in a final superposition in which the amplitude of configuration $|x\rangle|1^{T(x)}\rangle|1\rangle$ is 
$2\rho_M(x) - 1$, where the last qubit is the content of the output tape. 

Proof. Let $M$ be the QTM given in the lemma. We define the desired QTM $N$ 
as follows. First, $N$ simulates $M$ on input $(x, 1^{T(x)})$. Assume that $M$ halts in a 
final superposition $|\phi\rangle = \sum_y \alpha_{x,y}|y\rangle|b_y\rangle$, where $y$ ranges over all (valid) configurations 
of $M$ and the qubit $|b_y\rangle$ represents the content of the output tape of $M$. Note 
that $\rho_M(x) = \sum_{y: b_y = 1} |\alpha_{x,y}|^2$. After $N$ applies the phase shift $P$ to $|b_y\rangle$, we have 
the superposition $|\phi'\rangle = \sum_{y: b_y = 1} \alpha_{x,y}|y\rangle|1\rangle - \sum_{y: b_y = 0} \alpha_{x,y}|y\rangle|0\rangle$. 






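There exists a well-formed QTM $M^R$ that reverses the computation of $M$ in 
time $O(T(x)^2)$ on input $(x, 1^{T(x)})$ [14]. Then, $N$ simulates $M^R$ starting with $|\phi'\rangle$. 
Note that if we apply $M^R$ to $|\phi\rangle$ instead, then $M^R(|\phi\rangle)$ is the initial configuration. 
Moreover, the inner product of $|\phi\rangle$ and $|\phi'\rangle$ is 
$\langle\phi|\phi'\rangle = \sum_{y: b_y = 1} |\alpha_{x,y}|^2 - \sum_{y: b_y = 0} |\alpha_{x,y}|^2$, 
which equals $2\rho_M(x) - 1$. 

At the end, $N$ outputs 1 (i.e., accepts) if it observes exactly the initial 
configuration; otherwise, $N$ outputs 0 (i.e., rejects). Note that the amplitude of 
this accepting configuration in the final superposition of $N$ is exactly 
$\langle N(|\phi\rangle)|N(|\phi'\rangle)\rangle$. Since $N$ preserves the inner product, 
we have $\langle N(|\phi\rangle)|N(|\phi'\rangle)\rangle = \langle\phi|\phi'\rangle = 2\rho_M(x) - 1$. □ 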
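
The two facts used in this proof (the phase shift turns the self inner product into the gap, and a unitary step leaves that inner product unchanged) can be illustrated numerically. The sketch below is a toy check, not a QTM simulation: the eight "configurations", their output bits, the random amplitudes, and the Givens rotation standing in for a unitary reversal step are all assumptions made for illustration.

```python
import math
import random


def inner(u, v):
    """Inner product <u|v> of two complex amplitude vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def phase_shift_output(amps, bits):
    """Apply P to the output qubit: negate amplitudes whose output bit is 0."""
    return [a if b == 1 else -a for a, b in zip(amps, bits)]


def givens(vec, i, j, theta):
    """A simple unitary step: rotate coordinates i and j by angle theta."""
    out = list(vec)
    c, s = math.cos(theta), math.sin(theta)
    out[i] = c * vec[i] - s * vec[j]
    out[j] = s * vec[i] + c * vec[j]
    return out


random.seed(0)
bits = [0, 1, 1, 0, 1, 0, 0, 1]           # output bit b_y of each configuration y
raw = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in bits]
norm = math.sqrt(sum(abs(a) ** 2 for a in raw))
phi = [a / norm for a in raw]             # final superposition |phi>
phi_prime = phase_shift_output(phi, bits)  # |phi'>

rho = sum(abs(a) ** 2 for a, b in zip(phi, bits) if b == 1)
gap = inner(phi, phi_prime)
rotated_gap = inner(givens(phi, 0, 5, 0.7), givens(phi_prime, 0, 5, 0.7))
print(gap.real, 2 * rho - 1, rotated_gap.real)
```

The three printed values agree up to floating-point error, which mirrors the chain $\langle N(|\phi\rangle)|N(|\phi'\rangle)\rangle = \langle\phi|\phi'\rangle = 2\rho_M(x) - 1$.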

Now we are ready to prove Theorem 2. 

Proof of Theorem 2. Let $f$ be in GapQP and $M$ a polynomial-time, well-formed 
QTM such that $f(x) = 2\rho_M(x) - 1$ for all $x$. It follows from the Gap Squaring 
Lemma that there exists a polynomial-time, well-formed QTM $N$ that halts in 
a final superposition in which the squared magnitude of the amplitude of $|x\rangle|1\rangle$ 
is $(2\rho_M(x) - 1)^2$, which clearly equals $f^2(x)$. □ 



4.2 Approximation of Quantum Gap Functions. Quantum gap functions 
are closely related to their classical counterpart: gap functions [8]. The 
following theorem shows the close relationship between GapQP and GapP. 
Let $\mathrm{sign}(a)$ be 0, 1, and $-1$ if $a = 0$, $a > 0$, and $a < 0$, respectively. 



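Theorem 3. 1. For every $f \in \mathrm{GapQP}_{\mathbb{C}}$, there exists a function $g \in \mathrm{GapP}$ 
such that, for every $x$, $f(x) = 0$ iff $g(x) = 0$. 

2. For every $f \in \mathrm{GapQP}$ and every polynomial $q$, there exist two functions 
$k \in \mathrm{GapP}$ and $\ell \in \mathrm{FP}$ such that 
$\left| f(x) - \frac{k(x)}{\ell(1^{|x|})} \right| \le 2^{-q(|x|)}$ for all $x$. 

3. For every $f \in \mathrm{GapQP}_{\mathbb{A} \cap \mathbb{R}}$, there exists a function $g \in \mathrm{GapP}$ such that 
$\mathrm{sign}(f(x)) = \mathrm{sign}(g(x))$ for all $x$. 

4. For every $f \in \mathrm{GapQP}_{\mathbb{Q}}$, there exist two functions $k \in \mathrm{GapP}$ and $\ell \in \mathrm{FP}$ 
such that $f(x) = \frac{k(x)}{\ell(1^{|x|})}$ for all $x$. 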



To show the theorem, we need the following lemma. For a QTM M, let 
ampM{x, C) denote the amplitude of configuration G of M on input x in a final 
superposition. The (complex) conjugate of M is the QTM M* defined exactly 
as M except that its time-evolution matrix is the complex conjugate of the 
time-evolution matrix of M . 



Lemma 4. Let $M$ be a well-formed, synchronous, stationary QTM in normal 
form with running time $T(x)$ on input $x$. There exists a well-formed QTM $N$ such 
that, for every $x$, (1) $N$ halts in time $O(T(x))$ with one symbol from $\{0, 1, ?\}$ in its 
output tape; (2) $\sum_{C \in D_x^1} \mathrm{amp}_N(x, C) = \rho_M(x)$; and (3) $\sum_{C \in D_x^0} \mathrm{amp}_N(x, C) = 
1 - \rho_M(x)$, where $D_x^i$ is the set of all final configurations, of $M$ on $x$, whose 
output tape consists only of symbol $i \in \{0, 1\}$. 






Proof Sketch. The desired QTM $N$ works as follows. On input $x$, $N$ simulates $M$ 
on input $x$; when it halts in a final configuration, $N$ starts another round of 
simulation, this time of $M^*$, in a different tape. After $N$ reaches a final configuration, 
we obtain two final configurations. Then, $N$ (reversibly) checks if both final 
configurations are identical. If not, $N$ outputs symbol “?” and halts. Assume 
otherwise. If this unique configuration is an accepting configuration, then $N$ 
outputs 1; otherwise, it outputs 0. □ 



Proof of Theorem 3. (1) First we note from [15] that, for every $g \in \#\mathrm{QP}_{\mathbb{C}}$, 
there exists an $h \in \mathrm{GapP}$ such that, for every $x$, $g(x) = 0$ iff $h(x) = 0$. 

Let $f$ be any function in $\mathrm{GapQP}_{\mathbb{C}}$. By the Squared Function Theorem, $f^2$ 
belongs to $\#\mathrm{QP}_{\mathbb{C}}$. Thus, there exists an $h \in \mathrm{GapP}$ such that, for every $x$, $f^2(x) = 0$ 
iff $h(x) = 0$. It immediately follows that $f(x) = 0$ iff $h(x) = 0$. 

(2) Let $f \in \mathrm{GapQP}$. By Lemma 4, it follows that there exists a polynomial-time, 
well-formed QTM $M$ such that $f(x)$ equals 
$\sum_{C \in D_x^1} \mathrm{amp}_M(x, C) - \sum_{C \in D_x^0} \mathrm{amp}_M(x, C)$, where $D_x^i$ is defined in Lemma 4. Let $r$ be a 
polynomial that bounds the running time of $M$ and also satisfies $|D_x^0 \cup D_x^1| \le 2^{r(|x|)}$. 

Let X be any string of length n. Let ^{x) = and 

g{x,C) = ampM{x,C). Assume first that there exists a 

GapQP-function g such that \g{x,C) — which implies 

J2ceDi9ix,C)-j:ceDi w| < • 2-’'(")-«(")-i = 2-«(")-i for each 

i G {0, 1}. The desired GapQP-function k is defined as k{x) = ~ 

'^CeDO 9{x, C). 

To complete the proof, we show the existence of such g. Note that every ampli- 
tude of the transition function of M is approximated by a polynomial-time TM. 
By simulating such a machine in polynomial time, we can get a 2r(n) -I- q{n) + 1 
bit approximation of the corresponding amplitude to within so 

that, for every computation path P of M on input x, we can compute the approx- 
imation pp of the amplitude pp of path P to within Let h{x,P) 

be the integer satisfying h{x,P) = £(l")|pp|. Set g{x,C) = 'P.p^p^ ^ h{x, P), 
where Px,c is the collection of all paths of M on a; that lead to configuration C. 

(3) This follows by a modification of the proof of Lemma 6.8 in [1]. 

(4) In the proof of (2), since $M$ has $\mathbb{Q}$-amplitudes, we can exactly compute 
the amplitude $\rho_P$ of path $P$ and thus, we have $h(x, P) = \ell(1^n)\rho_P$. Therefore, 
$f(x) = \frac{k(x)}{\ell(1^{|x|})}$. See also Theorem 3.1 in [11]. □ 
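
The idea behind part (4), that rational amplitudes let every path amplitude be computed exactly so that the gap becomes a ratio of an integer and a common denominator, can be illustrated with exact arithmetic. The sketch below is a toy two-state example and not the construction of the proof: the transition matrix `U` (a 3-4-5 rotation, chosen because it is unitary with rational entries), the function names, and the choice of denominator `5 ** (2 * steps)` are assumptions for illustration only.

```python
from fractions import Fraction
from itertools import product

# A toy one-step transition matrix with rational entries (columns = source state).
U = [[Fraction(3, 5), Fraction(-4, 5)],
     [Fraction(4, 5), Fraction(3, 5)]]


def path_amplitude(path):
    """Exact amplitude of one computation path, given as a sequence of states."""
    amp = Fraction(1)
    for src, dst in zip(path, path[1:]):
        amp *= U[dst][src]
    return amp


def final_amplitude(target, steps, start=0):
    """Sum the amplitudes of all paths of the given length from start to target."""
    total = Fraction(0)
    for middle in product((0, 1), repeat=steps - 1):
        total += path_amplitude((start, *middle, target))
    return total


def gap_as_ratio(steps):
    """Write the gap (state 1 accepts, state 0 rejects) as k / ell with integer k."""
    ell = 5 ** (2 * steps)                 # common denominator of all squared amplitudes
    gap = final_amplitude(1, steps) ** 2 - final_amplitude(0, steps) ** 2
    k = gap * ell
    if k.denominator != 1:
        raise ValueError("denominator bound too small")
    return int(k), ell


print(gap_as_ratio(1))
print(gap_as_ratio(2))
```

Here the integer `k` plays the role of the GapP-valued numerator and `ell` the role of the easily computable denominator; with irrational amplitudes only an approximation of this kind would be available, which is the situation of part (2).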



5 Quantum Complexity Classes 

Using the values of quantum functions, we can introduce a variety of quantum 
complexity classes. As in the previous sections, when $K = \tilde{\mathbb{C}}$, we drop the subscript $K$. 






5.1 GapQP-Definable Complexity Classes. What is the common feature 
of known quantum complexity classes? Bernstein and Vazirani [4] introduced 
$\mathrm{EQP}_K$ (exact QP) as the collection of sets $S$ such that an $\mathrm{FQP}_K$-function $f$ 
satisfies $f(x) = S(x)$ for all $x$. Similarly, $\mathrm{BQP}_K$ (bounded-error QP) is defined 
by an $\mathrm{FBQP}_K$-function instead of an $\mathrm{FQP}_K$-function. Adleman et al. [1] 
introduced $\mathrm{NQP}_K$ (nondeterministic QP), which is defined as the collection of 
sets $S$ such that there exists a $\#\mathrm{QP}_K$-function $f$ satisfying that, for every $x$, 
$S(x) = 1$ iff $f(x) > 0$. By a simple observation, $\mathrm{EQP}_K$, $\mathrm{BQP}_K$, and $\mathrm{NQP}_K$ 
can all be characterized in terms of $\mathrm{GapQP}_K$-functions. 

We introduce a general notion of GapQP-definability in the same spirit in which 
Fenner et al. [8] defined Gap-definability. 

Definition 7. A complexity class $\mathcal{C}$ is $\mathrm{GapQP}_K$-definable if there exists a pair 
of disjoint sets $A, R \subseteq \Sigma^* \times [0, 1]$ such that, for any $S$, $S \in \mathcal{C}$ iff there exists an 
$f \in \mathrm{GapQP}_K$ satisfying that, for every $x$, (i) $x \in S$ implies $(x, f(x)) \in A$ and 
(ii) $x \notin S$ implies $(x, f(x)) \in R$. We write $\mathrm{GapQP}_K(A, R)$ to denote this $\mathcal{C}$. 



Proposition 2. $\mathrm{EQP}_K$, $\mathrm{BQP}_K$, and $\mathrm{NQP}_K$ are all $\mathrm{GapQP}_K$-definable. 

An immediate challenge is to prove or disprove that the following quantum 
complexity class is GapQP-definable: $\mathrm{WQP}_K$ (wide QP), the collection of sets $A$ 
such that there exist an $f \in \#\mathrm{QP}_K$ and a $g \in \mathrm{FQP}_K$ with $\mathrm{range}(g) \subseteq (0, 1] \cap \mathbb{Q}$ 
satisfying $f(x) = A(x) \cdot g(x)$ for every $x$. Notice that we can replace $\#\mathrm{QP}_K$ by 
$\mathrm{GapQP}_K$. By definition, $\mathrm{EQP}_K \subseteq \mathrm{WQP}_K \subseteq \mathrm{NQP}_K$. 

5.2 Sets with Low Information. Let $\mathcal{C}$ be a relativizable complexity class. 
Consider a set $A$ satisfying $\mathcal{C}^A = \mathcal{C}$. Apparently, this set $A$ has low information 
since it does not help the class $\mathcal{C}$ gain more power. Such a set is called a low set 
for $\mathcal{C}$. Let low-$\mathcal{C}$ denote the collection of all low sets for $\mathcal{C}$. 

Obviously, low-EQP = EQP. Moreover, since $\mathrm{BQP}^{\mathrm{BQP}} = \mathrm{BQP}$, we 
obtain low-BQP = BQP. However, low-NQP does not appear to coincide with 
NQP since $\mathrm{EQP} \subseteq \text{low-NQP} \subseteq \mathrm{NQP} \cap \text{co-NQP}$. Thus, if $\mathrm{NQP} \neq \text{co-NQP}$, 
then $\text{low-NQP} \neq \mathrm{NQP}$. 

The following proposition is almost trivial given the fact that $\{0, \pm 1\} \subseteq K$. 

Proposition 3. $\mathrm{EQP}_K = \text{low-}\#\mathrm{QP}_K = \text{low-GapQP}_K$. 

5.3 Quantum Complexity Classes beyond BQP. We briefly discuss a 
few quantum complexity classes beyond BQP. 

We first note that any BQP set $A$ enjoys the amplification property: for every 
polynomial $p$, there exists a #QP-function $f$ such that, for every $x$, if $x \in A$ then 
$f(x) \ge 1 - 2^{-p(|x|)}$, and otherwise $0 \le f(x) \le 2^{-p(|x|)}$ [3]. Let $\mathrm{AQP}_K$ (amplified 
QP) be defined similarly by replacing #QP with $\mathrm{GapQP}_K$. By definition, 
$\mathrm{BQP} \subseteq \mathrm{AQP}$. Note that if $\mathrm{AQP} \neq \mathrm{BQP}$, then $\mathrm{GapQP} \cap [0,1]^{\Sigma^*} \neq \#\mathrm{QP}$. 
By Theorem 3, $\mathrm{AQP}_{\mathbb{Q}}$ remains within AWPP, which is defined in [7]. 






Another natural class beyond BQP is $\mathrm{PQP}_K$ (proper QP) defined as 
$\mathrm{GapQP}_K(A, \overline{A})$, where $A = \{(x, r) \mid r > 0\}$ and $\overline{A}$ is the complement of $A$. 
Theorem 3, however, yields the coincidence between $\mathrm{PQP}_{\mathbb{Q}}$ and PP. 

It is important to note that $\mathrm{AQP}_{\mathbb{C}}$ and $\mathrm{PQP}_{\mathbb{C}}$ are no longer recursive since 
$\mathrm{BQP}_{\mathbb{C}}$ is known to be non-recursive [1]. 

6 Computation with Oracle Queries 

In this section, we study quantum functions that can access oracle sets. In what 
follows, $r$ denotes an arbitrary function from $\mathbb{N}$ to $\mathbb{N}$, $R$ a subset of $\mathbb{N}^{\mathbb{N}}$, and $A$ a 
subset of $\Sigma^*$. 

6.1 Adaptive and Nonadaptive Queries. We begin with the formal defi- 
nitions. 

Definition 8. A function $f$ is in $\mathrm{FQP}^{A[r]}$ if there exists a polynomial-time, 
well-formed, oracle QTM $M$ such that, for every $x$, $M$ on input $x$ computes 
$f(x)$ using oracle $A$ and makes at most $r(|x|)$ queries on each computation path. 
The class $\mathrm{FQP}^{A[R]}$ is the union of such sets $\mathrm{FQP}^{A[r]}$ over all $r \in R$. 

Note that, when $R = \mathbb{N}^{\mathbb{N}}$, $\mathrm{FQP}^{A[R]}$ coincides with $\mathrm{FQP}^A$. 

Proposition 4. Let C be a complexity class that is closed under union, comple- 
ment, and polynomial-time conjunctive reducibility. If f G ^QP*' then Xx. 

G for a certain polynomial p. 

Proof Sketch. Assume that a given QTM $M$ is of “canonical form”; that is, 
there exists a polynomial $p$ such that $M$ makes $p(n)$ queries of length $p(n)$ on 
every computation path, where $n$ is the length of the input [14]. Consider the following 
quantum algorithm. First guess future oracle answers $\{a_i\}_{1 \le i \le p(n)}$ and start 
the simulation of $M$ using $\{a_i\}_{1 \le i \le p(n)}$ to answer actual queries $\{w_i\}_{1 \le i \le p(n)}$. 
When $M$ accepts the input, make a single query $\langle w_1, \ldots, w_{p(n)}, a_1, \ldots, a_{p(n)} \rangle$ to 
the oracle and verify the correctness of $\{a_i\}_{1 \le i \le p(n)}$. □ 

Next, we introduce quantum functions that make nonadaptive (or parallel) 
queries to oracles. The functions in Definition 8, on the contrary, make adaptive 
(or sequential) queries. 

Definition 9. The class $\mathrm{FQP}_{\|}^{A[r]}$ is the subset of $\mathrm{FQP}^{A[r]}$ with the extra condition 
that, on each computation path $p$, just before $M$ enters a pre-query state for 
the first time, it completes a query list^3 (a list of all query words, separated by 
a special separator in a distinguished tape) that are possibly^4 queried on path $p$. 
The notation $\mathrm{FQP}_{\|}^{A[R]}$ denotes the union of $\mathrm{FQP}_{\|}^{A[r]}$ over all $r \in R$. 

^3 When we say a “query list”, we refer to the list completed just before the first query. 
This query list may be altered afterward to interfere with other computation paths 
having different query lists. 

^4 Not all the words in the query list need be queried, but any word that is queried must 
be in the query list. 






A similar constraint gives rise to $\#\mathrm{QP}_{\|}^{A[R]}$ and $\mathrm{GapQP}_{\|}^{A[R]}$. 

The proof of Theorem 1 implies that ^ #TIME(o(2”))^ relative 

to a certain oracle A. 

Proposition 5. $\mathrm{FQP}^{A[r]} \subseteq \mathrm{FQP}_{\|}^{A[r]}$ for any function $r$ in $O(\log n)$. 

Proof. Let $f \in \mathrm{FQP}^{A[r]}$ and assume that a polynomial-time, well-formed 
QTM $M$ witnesses $f$ with at most $r(|x|)$ queries to $A$ on each computation path 
on any input $x$. Consider the following quantum algorithm. 

Let $x$ be any input of length $n$. We use a binary string $a$ of length $r(n)$. 
Initially, we set $a = 0^{r(n)}$. In the $k$th round, $1 \le k \le 2^{r(n)}$, we simulate $M$ on 
input $x$ except that when $M$ makes the $i$th query, we draw the $i$th bit of $a$ as its 
oracle answer. In case $M$ makes more than $r(n)$ queries, we automatically 
set their oracle answers to be 0. We record all query words in an extra tape. 
After $M$ halts, we increment $a$ lexicographically by one. After $2^{r(n)}$ rounds, we 
have a query list of size at most $r(n)2^{r(n)}$. We then simulate $M$ again 
using oracle $A$. This last procedure preserves the original quantum interference. 
Therefore, $f \in \mathrm{FQP}_{\|}^{A[r]}$. □ 
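
The round-by-round enumeration in this proof has a simple classical analogue, sketched below in Python. It only illustrates how running a query-making procedure once per guessed answer string yields a list that covers every word the procedure can ask under the real oracle; it does not model superpositions or interference. The names `collect_query_list`, `run_with_real_oracle`, and `toy_machine` are hypothetical, and the machine is assumed to be an ordinary deterministic function of its input and the oracle answers.

```python
from itertools import product


def collect_query_list(machine, x, r):
    """Run `machine` once for each answer string a in {0,1}^r (lexicographic order)
    and record every query word that is asked in some round."""
    seen = []
    for answers in product((0, 1), repeat=r):
        count = 0

        def guessed_oracle(word):
            nonlocal count
            if word not in seen:
                seen.append(word)
            # The i-th query is answered by the i-th bit of a; extra queries get 0.
            bit = answers[count] if count < r else 0
            count += 1
            return bit

        machine(x, guessed_oracle)
    return seen


def run_with_real_oracle(machine, x, oracle):
    """Run `machine` with the actual oracle and record which words it asks."""
    asked = []

    def recording_oracle(word):
        asked.append(word)
        return oracle(word)

    return machine(x, recording_oracle), asked


def toy_machine(x, oracle):
    """Hypothetical adaptive machine: each query word depends on earlier answers."""
    word = x
    for _ in range(3):
        word = word + str(oracle(word))
    return word


query_list = collect_query_list(toy_machine, "x", 3)
result, asked = run_with_real_oracle(toy_machine, "x", lambda w: len(w) % 2)
print(query_list)
print(asked)
```

With r queries the list holds at most r·2^r words, which stays polynomial in n only when r is in O(log n); this is the reason for the restriction on r in the proposition.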

In particular, we have C FQPjj^*^^, which is analogous to 

ppNP[0(iogn)] ^ FPjf^. It is open, however, whether C FQP"^M. 

6.2 Separation Result. We show the existence of a set $A$ that separates 
$\#\mathrm{QP}^A$ from $\#\mathrm{QP}_{\|}^A$. For a string $y$ and a superposition $|\phi\rangle$ of configurations, 
let $q_y(|\phi\rangle)$ denote the query magnitude [3]; that is, the sum of squared magnitudes 
in $|\phi\rangle$ of configurations which have a pre-query state and query word $y$. 

Theorem 4. There exists a set $A$ such that $\#\mathrm{QP}^A \cap \{0, 1\}^{\Sigma^*} \nsubseteq \#\mathrm{QP}_{\|}^A$. 

Proof. Define $f^A(x) = 2^{-|x|} \cdot |\{y \in \Sigma^{|x|} \mid A(x \circ y_A) = 1\}|$ for $x \in \Sigma^*$ and 
$A \subseteq \Sigma^*$, where $y_A = A(y0^{|y|})A(y0^{|y|-1}1)A(y0^{|y|-2}1^2) \cdots A(y1^{|y|})$. We call a 
set $A$ good at $n$ if, for any pair $y, y' \in \Sigma^n$, $|y| = |y'|$ implies $y_A = y'_A$; $A$ is good 
if $A$ is good at every $n$. It follows that, for any good $A$, $f^A \in \#\mathrm{QP}^A \cap \{0, 1\}^{\Sigma^*}$. 

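We want to construct a good set $A$ such that $f^A \notin \#\mathrm{QP}_{\|}^A$. Let $\{M_i\}_{i \in \mathbb{N}}$ 
and $\{p_i\}_{i \in \mathbb{N}}$ be two enumerations of polynomial-time, well-formed QTMs and 
polynomials such that each $M_i$ halts in time $p_i(n)$ on all inputs of length $n$. We 
build by stages a series of disjoint sets $\{A_i\}_{i \in \mathbb{N}}$ and then define $A = \bigcup_{i \in \mathbb{N}} A_i$. 

For convenience, set $n_{-1} = 0$ and $A_{-1} = \emptyset$. Consider stage $i$. Let $n_i$ be the minimal integer 
such that $n_{i-1} < n_i$ and $8p_i(n_i)^2 < 2^{n_i}$. It suffices to show the existence of a 
set $A_i \subseteq \Sigma^{2n_i} \cup \Sigma^{2n_i+1}$ such that $A_i$ is good at $n_i$ and $f^{A_i}(0^{n_i}) \neq \rho_{M_i^{A_i}}(0^{n_i})$. 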

For readability, we omit subscript i in what follows. To draw a contradiction, 
we assume otherwise. Let \4>j) be the superposition of M on input 0" at time j. 
For each y G A”, let fy be the sum of squared magnitudes in any superposition 
of M’s configurations whose query list contains word y. Let S be the set of all 
y G A" such that M on input 0” queries 0”y. In general, since 

the size of each query list is at most p(n). By our assumption, however, IS"! = 2”. 






Let y be any string in S and fix A such that y = yA- Moreover, let Ay be A 
except that Ay{0^y) = 1 — A(0"y). It follows by our assumption that = 

1 — p^A„(0"'). By Theorem 3.3 in [3], since |pm^(0”) ~ Pm^v{Q'^)\ = 1, we have 
Qyi\(l>j)) > 8^- Since EJ”i Qyi\<l^j)) < %-p{n), we have % > 

This immediately draws the conclusion that |5| < 8p(n)'^, a contradiction. □ 



Acknowledgments 

The author would like to thank Andy Yao and Yaoyun Shi for interesting discussions 
on quantum complexity theory. He is also grateful to the anonymous referees 
for their critical comments on an early draft. 



References 

1. L. M. Adleman, J. DeMarrais, and M. A. Huang, Quantum computability, SIAM 
J. Comput., 26 (1997), 1524-1540. 

2. P. Benioff, The computer as a physical system: A microscopic quantum mechanical 
Hamiltonian model of computers as represented by Turing machines, J. Stat. Phys., 
22 (1980), 563-591. 

3. C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani, Strengths and weaknesses 
of quantum computing, SIAM J. Comput., 26 (1997), 1510-1523. 

4. E. Bernstein and U. Vazirani, Quantum complexity theory, SIAM J. Comput., 26 
(1997), 1411-1473. 

5. G. Brassard, P. Høyer, and A. Tapp, Quantum counting, Proc. 25th International 
Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer 
Science, Vol. 1443, pp. 820-831, 1998. 

6. D. Deutsch, Quantum theory, the Church-Turing principle, and the universal quantum 
computer, Proc. Roy. Soc. London, A, 400 (1985), 97-117. 

7. S. Fenner, L. Fortnow, S. Kurtz, and L. Li, An oracle builder's toolkit, Proc. 8th 
IEEE Conference on Structure in Complexity Theory, pp. 120-131, 1993. 

8. S. Fenner, L. Fortnow, and S. Kurtz, Gap-definable counting classes, J. Comput. 
and System Sci., 48 (1994), 116-148. 

9. S. Fenner, F. Green, S. Homer, and R. Pruim, Determining acceptance possibility 
for a quantum computation is hard for PH, Proc. 6th Italian Conference on 
Theoretical Computer Science, World-Scientific, Singapore, pp. 241-252, 1998. 

10. R. Feynman, Simulating physics with computers, Intern. J. Theoret. Phys., 21 
(1982), 467-488. 

11. L. Fortnow and J. Rogers, Complexity limitations on quantum computation, Proc. 
13th IEEE Conference on Computational Complexity, pp. 202-209, 1998. 

12. L. Grover, A fast quantum mechanical algorithm for database search, Proc. 28th 
ACM Symposium on Theory of Computing, pp. 212-219, 1996. 

13. L. G. Valiant, The complexity of computing the permanent, Theor. Comput. Sci., 
8 (1979), 189-201. 

14. T. Yamakami, A foundation of programming a multi-tape quantum Turing machine, 
Proc. 24th International Symposium on Mathematical Foundations of Computer 
Science, Lecture Notes in Computer Science, Vol. 1672, pp. 430-441, 1999. See 
also LANL quant-ph/9906084. 

15. T. Yamakami and A. C. Yao, $\mathrm{NQP}_{\mathbb{C}} = \text{co-C}_{=}\mathrm{P}$, to appear in Inform. Process. Lett. 
See also LANL quant-ph/9812032, 1998. 

16. A. C. Yao, Quantum circuit complexity, Proc. 34th IEEE Symposium on Foundations 
of Computer Science, pp. 352-361, 1993. 



On Sets Growing Continuously 



Bernhard Heinemann 



Fachbereich Informatik, FernUniversität Hagen 
D-58084 Hagen, Germany 
phone: ++49-2331-987-2714 
fax: ++49-2331-987-319 
bernhard.heinemann@fernuni-hagen.de 



Abstract. In this paper we introduce a system by means of which 
the growth of sets in the course of continuous (linear) time can be 
modelled. The system is based on a bimodal language. It originates from 
certain logics admitting topological reasoning where shrinking of sets is 
the subject of investigation, but is ‘dual’ to those in an obvious sense. 
After the motivating part we examine the system from a logical point of 
view. Our main results include completeness of a proposed axiomatisa- 
tion and decidability of the set of validities. The intended applications 
concern all fields of (spatio-)temporal reasoning. 



1 Introduction 

Subsequently we are concerned with the change of sets in the course of time. 
For the moment, this is a very general approach: changing sets actually occur 
in many fields of computer science. Let us mention three different examples. 
First, considering an agent involved in a multi-agent system, the set of states 
representing its knowledge changes during a run of the system. Thus, in order to 
specify its behaviour one should have at one’s disposal a tool by which one is able 
to treat changing sets (of this particular kind) formally. The significance of the 
knowledge-based approach to modelling multi-agent systems has been pointed 
out convincingly; see [6], e.g., or [10], where in particular distributed systems 
are emphasized. The language we introduce below in fact originates from this 
context, and a knowledge operator is retained in it (but is mainly intended to 
quantify inside sets now). 

The second example stems from the realm of databases, where spatio-temporal 
modelling and reasoning is an actual field of research. There one might want 
to specify the temporal change of geometric shapes like bad-weather regions or 
coastlines, for instance. Applications to weather forecast and geological prog- 
noses, respectively, are obvious. 

Finally, spatio-temporal reasoning receives much attention in AI as well. Here 
the picture is rather heterogeneous, and several different approaches to this field 
have been proposed. To get an impression of the state of the art the reader may 
consult, for instance, the proceedings volume of the last ECAI conference [15] 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 420—431, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






where a considerable number of papers has been contributed to the correspond- 
ing subdivision. 

In what follows the reader will find a unifying approach to those different viewpoints.
Accordingly, we extract parts of their common ground and cast them into
an appropriate logical system, as a step towards laying the corresponding foundations.
To this end let us briefly describe our starting position. 

A system admitting topological reasoning - and being related to the logic of
knowledge - has been developed recently [5]. It captures shrinking of sets and
thus offers access to our topic in a certain sense. For convenience of the reader
we mention its very basic features. Although originating from modal logic the
domains of interpretation involved are not usual Kripke models, but set spaces
$(X, \mathcal{O})$, where $X$ is a non-empty set (of states, e.g.) and $\mathcal{O}$ is a set of subsets
of $X$. In the simplest case there are two modalities included: one, designated $K$,
which quantifies ‘horizontally’ over the elements of a set $U \in \mathcal{O}$, and another
one, $\Box$, which quantifies ‘vertically’ over its subsets contained in $\mathcal{O}$, expressing
‘shrinking’ in this way. The modalities also interact. This interaction depends 
on the actual model, the class of semantic structures one has in mind, and it is 
a challenging task in general to describe it axiomatically. 

In this paper we dually deal with the growth of sets. Our idea is to develop a 
suitable ‘modal’ logic which modifies the $\Box$-operator of the logic of set spaces
appropriately. Moreover, we want to consider frames in which sets increase con- 
tinuously, as it is mostly the case in real life. (The discrete case is treated 
elsewhere.) It turns out that this works in a way still preserving connections 
with distributed systems, the most important area of application of the logic of 
knowledge: our logic can be applied to synchronous multi-agent systems with no 
learning; see [11]. 

It should be remarked that there are different formalisms of computational logic 
dealing with dynamic aspects of sets as well: the logical treatment of hybrid sys- 
tems [1], for instance, or the duration calculus [12]. Their interesting relationship 
with the present approach deserves a closer examination. 

All in all, our aim is to provide a modal basis for the spatio-temporal reason- 
ing framework described above, which focuses on continuously increasing sets 
presently. In this respect our exposition is of a theoretical nature. What we de- 
velop subsequently is a bimodal system having a strong temporal flavour; in 
fact, it may be viewed as a generalization of the temporal logic of continuous 
linear time (see [9], §8). Although the use of modal logics of this kind is very 
common in computer science, systems related to the present one are scarce; 
see [7], [8], [13], [14] for some examples. The treatment of increasing sets and 
continuity in this context is new; note that the systems considered in [4] are 
different from ours. 

We now proceed to the technical details. In the next section we define the lan- 
guage underlying our system. In particular, we argue that set frames can be 
used for our task of modelling. Afterwards we introduce the logic and prove its 
soundness and completeness w.r.t. several classes of structures we have in mind. 






Bernhard Heinemann 



In the final technical section we show that the set of theorems of the logic, i.e., 
the set of formulas which are derivable in the given calculus, is decidable. 

Except for some fundamentals of modal and temporal logic the paper is self- 
contained. Concerning this basic material we refer the reader to the standard 
textbooks [3], [2], and, in particular, [9]. All new techniques and constructions 
are brought in, but detailed proofs are omitted due to limited space. 



2 Prerequisites 

The definition of the syntax of CG starts at a suitable finite alphabet, which in 
particular enables one to define a recursive set of propositional variables, PV. 
The set $\mathcal{F}$ of CG-formulas is the minimal set of strings satisfying

$PV \subseteq \mathcal{F}$, and $\alpha, \beta \in \mathcal{F} \Longrightarrow \neg\alpha,\ K\alpha,\ \Box\alpha,\ (\alpha \wedge \beta) \in \mathcal{F}$.

The operator $K$ is retained from the logic of knowledge [6]; presently it is
intended to quantify within sets. The second operator, $\Box$, captures increasing of
sets by quantifying over all supersets of the actually considered set. As usual,
we let $L\alpha := \neg K \neg\alpha$ and $\Diamond\alpha := \neg\Box\neg\alpha$.
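
The inductive definition of the formulas can be mirrored by a small abstract syntax tree. The following Python sketch is only an illustration: the class names (Var, Not, And, K, Box), the helper functions L and Dia, and the ASCII rendering chosen in show are assumptions of this example and not part of the paper. The duals are introduced as abbreviations, as in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str          # a propositional variable from PV


@dataclass(frozen=True)
class Not:
    sub: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class K:
    sub: object


@dataclass(frozen=True)
class Box:
    sub: object


def L(a):
    """Dual of K: L a abbreviates not K not a."""
    return Not(K(Not(a)))


def Dia(a):
    """Dual of Box: Dia a abbreviates not Box not a."""
    return Not(Box(Not(a)))


def show(f):
    """Render a formula as an ASCII string (the notation is our own choice)."""
    if isinstance(f, Var):
        return f.name
    if isinstance(f, Not):
        return "~" + show(f.sub)
    if isinstance(f, And):
        return "(" + show(f.left) + " & " + show(f.right) + ")"
    if isinstance(f, K):
        return "K" + show(f.sub)
    if isinstance(f, Box):
        return "[]" + show(f.sub)
    raise TypeError(f)
```

Since the duals are plain abbreviations, Dia(K(Var("A"))) unfolds to a formula built from Not, Box and K only.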

The idea behind the semantics of CG is as follows. We would like to describe
the continuous growing of a given set, $Y$. Thus certain supersets of $Y$ have to be
considered in the formal model. Consequently, we take a universe, $X$, in which
all these sets are contained, and the system of these sets, $\mathcal{O}$, as the basic
ingredients of interpreting formulas. Moreover, by means of a mapping $\sigma$ we assign a
truth value to the propositions depending (only) on points (and not on sets; see
below). Thus we take certain triples $(X, \mathcal{O}, \sigma)$ as the relevant semantic structures.
These are specified precisely by the subsequent definition.

Definition 1. 1. Let $X$ be a non-empty set and $\mathcal{O}$ a set of non-empty subsets
of $X$. Then the pair $\mathcal{S} = (X, \mathcal{O})$ is called a set frame.

2. A set frame $\mathcal{S}$ is called densely ordered, iff $\mathcal{O}$ is linearly ordered by (reverse)
proper set inclusion, and for all $U, V \in \mathcal{O}$ such that $U \supset V$ there exists
$W \in \mathcal{O}$ satisfying $U \supset W \supset V$. $\mathcal{S}$ is called rational, iff $(\mathcal{O}, \supset)$ is isomorphic
to $(\mathbb{Q}, >)$, and continuous, iff it is isomorphic to $(\mathbb{R}, >)$.

3. Let $\mathcal{S} = (X, \mathcal{O})$ be a (densely ordered, rational, continuous) set frame and
$\sigma : PV \times X \to \{0,1\}$ a mapping. Then $\sigma$ is called a valuation, and the
triple $\mathcal{M} = (X, \mathcal{O}, \sigma)$ is called a (densely ordered, continuous) model (based
on $\mathcal{S}$).
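
The conditions of Definition 1 can be tested mechanically on finite families of sets. The following Python sketch assumes that sets are given as ordinary Python sets; the function names are hypothetical. Note that a finite chain with at least two members always has a gap, so it is never densely ordered; the function gaps merely lists the pairs witnessing this.

```python
def is_set_frame(X, family):
    """X non-empty, and every member of family is a non-empty subset of X."""
    universe = frozenset(X)
    return len(universe) > 0 and all(
        len(u) > 0 and frozenset(u) <= universe for u in family
    )


def is_chain(family):
    """True iff any two members are equal or comparable by proper inclusion."""
    sets = [frozenset(u) for u in family]
    return all(u == v or u < v or v < u for u in sets for v in sets)


def gaps(family):
    """Pairs (V, U) with V properly contained in U and no member strictly between."""
    sets = sorted({frozenset(u) for u in family}, key=len)
    return [
        (v, u)
        for v in sets
        for u in sets
        if v < u and not any(v < w and w < u for w in sets)
    ]
```

A densely ordered set frame is exactly a chain for which gaps would be empty, which requires infinitely many sets as soon as there are two.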

Note that we have three ‘rising’ qualities of continuity given by the order
type of $\mathcal{O}$; it turns out that they are not distinguishable modally.

We define next how to interpret formulas in models at situations of set frames, which are
simply pairs $x, U$ (designated without brackets mostly) such that $x \in U \in \mathcal{O}$.
Only the crucial cases are mentioned.






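Definition 2. Let a model $\mathcal{M} = (X, \mathcal{O}, \sigma)$ and a situation $x, U$ of the set frame
$(X, \mathcal{O})$ be given. Then we define for all $A \in PV$ and $\alpha \in \mathcal{F}$:

$x, U \models_{\mathcal{M}} A :\iff \sigma(A, x) = 1$

$x, U \models_{\mathcal{M}} K\alpha :\iff y, U \models_{\mathcal{M}} \alpha$ for all $y \in U$

$x, U \models_{\mathcal{M}} \Box\alpha :\iff x, V \models_{\mathcal{M}} \alpha$ for all $V \supset U$ contained in $\mathcal{O}$.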

Notice that the definition of the validity of a propositional variable A at a 
situation x, U is independent of U. This is in accordance with the proceeding 
in topological modal logic (see [5]) and enables us to define the semantics in a 
quite natural way via situations; over and above that we use this definition in 
the completeness proof decisively. 

In case $x, U \models_{\mathcal{M}} \alpha$ is valid we say that $\alpha$ holds in $\mathcal{M}$ at the situation $x, U$;
moreover, the formula $\alpha \in \mathcal{F}$ is said to hold in $\mathcal{M}$ (denoted by $\models_{\mathcal{M}} \alpha$), iff it
holds in $\mathcal{M}$ at every situation. If there is no ambiguity, we sometimes omit the
index $\mathcal{M}$.
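
The clauses of Definition 2 translate directly into a recursive evaluator. The sketch below is a minimal illustration under stated assumptions: formulas are encoded as nested tuples, the family of sets is a finite list of frozensets (so the frame is not densely ordered, and the top set has no proper superset), and all names are hypothetical. It only demonstrates the satisfaction clauses; it is not the decision procedure discussed later.

```python
def holds(f, x, U, O, sigma):
    """Truth of formula f at the situation (x, U).

    f     : nested tuple, e.g. ("K", ("var", "A"))
    U     : a frozenset from O with x in U
    O     : list of frozensets
    sigma : dict mapping (variable name, point) to True/False
    """
    op = f[0]
    if op == "var":
        return sigma[(f[1], x)]                    # does not depend on U
    if op == "not":
        return not holds(f[1], x, U, O, sigma)
    if op == "and":
        return holds(f[1], x, U, O, sigma) and holds(f[2], x, U, O, sigma)
    if op == "K":
        return all(holds(f[1], y, U, O, sigma) for y in U)
    if op == "box":
        # quantify over all proper supersets of U contained in O
        return all(holds(f[1], x, V, O, sigma) for V in O if U < V)
    raise ValueError(op)


def dia(f):
    """Dual of box."""
    return ("not", ("box", ("not", f)))


def implies(f, g):
    """Material implication expressed with 'not' and 'and'."""
    return ("not", ("and", f, ("not", g)))
```

For instance, with the chain {1}, {1,2}, {1,2,3} and a variable A true at the points 1 and 2 only, the formula K A holds at the situation 1, {1,2} but fails at 1, {1,2,3}.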



3 The Logic 

In this section we present a logical system which permits formally deriving
all validities of the semantic domains we have in mind: models based on densely
ordered, rational and continuous set frames, respectively. We list several axioms
and a couple of rules below, constituting this system. Afterwards we show
completeness of the axiomatisation w.r.t. the just mentioned classes of structures.
In this way we obtain in particular a generalization of the modal system K4DLX
considered in [9], p. 56 f. As axioms we have for all $A \in PV$ and $\alpha, \beta \in \mathcal{F}$:



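(1) All $\mathcal{F}$-instances of propositional tautologies
(2) $K(\alpha \to \beta) \to (K\alpha \to K\beta)$
(3) $K\alpha \to \alpha$
(4) $K\alpha \to KK\alpha$
(5) $L\alpha \to KL\alpha$
(6) $(A \to \Box A) \wedge (\neg A \to \Box\neg A)$
(7) $\Box(\alpha \to \beta) \to (\Box\alpha \to \Box\beta)$
(8) $\Box\alpha \to \Diamond\alpha$
(9) $\Box\alpha \leftrightarrow \Box\Box\alpha$
(10) $\Box(\alpha \wedge \Box\alpha \to \beta) \vee \Box(\beta \wedge \Box\beta \to \alpha)$
(11) $\Diamond K\alpha \to K\Diamond\alpha$.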





Let us give some comments on these axioms. Apart from the first and the last
one - the first embeds propositional logic and the last will be discussed later
on - they apparently fall into two groups, each of which concerns one modal
operator. The first group, given by the schemes (2) - (5), is well-known from the
common logic of knowledge of a single agent; see [6]. In terms of modal logic, these
S5-axioms express the properties of reflexivity, transitivity and weak symmetry,
respectively, of the accessibility relation of the frame under consideration.






The ‘transitivity axiom’ is also present for $\Box$; compare the left-to-right
direction of the scheme (9). Axiom (10) corresponds with weak connectedness in
this modal meaning; i.e., the scheme is valid in a usual Kripke frame $(X, R)$ iff
for arbitrary points $s, t, u \in X$ such that $s R t$ and $s R u$ it holds that
$t R u$ or $u R t$ or $u = t$. In set frames it is responsible for linearity of the set $\mathcal{O}$, in
connection with (11). The scheme (8) tones down reflexivity to seriality, which
means that for all $s \in X$ there is a $t \in X$ such that $s R t$. Finally, the right-to-left
direction of (9) encapsulates weak density; i.e., if $s R t$, then $s R u$ and $u R t$
holds for some $u$ ($s, t, u \in X$).

But the group of axioms involving only the
$\Box$-operator comprises yet another peculiarity: the scheme (6). It allows us to
define the semantics of propositional variables in the way we did above, namely
without explicit reference to sets occurring in situations, but it implies that the
system to be defined immediately is not closed under substitution. Regarding
content, (6) says that the atomic propositions are ‘stable’ or ‘persistent’ during
increasing. In the paper the scheme also serves as an appropriate technical means
in order to prove completeness.
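
The frame conditions named in these comments (seriality, transitivity, weak density, weak connectedness) are easy to test on a finite Kripke frame. The following Python sketch uses hypothetical function names and represents a relation as a set of pairs. It illustrates why finite frames satisfying all four conditions must contain reflexive points: a strict finite order fails both seriality and weak density.

```python
def serial(W, R):
    """Every point has at least one R-successor."""
    return all(any((s, t) in R for t in W) for s in W)


def transitive(W, R):
    """s R t and t R u imply s R u."""
    return all((s, u) in R for (s, t) in R for (t2, u) in R if t == t2)


def weakly_dense(W, R):
    """s R t implies s R u and u R t for some u."""
    return all(any((s, u) in R and (u, t) in R for u in W) for (s, t) in R)


def weakly_connected(W, R):
    """s R t and s R u imply t R u or u R t or u = t."""
    return all(
        (t, u) in R or (u, t) in R or t == u
        for (s, t) in R
        for (s2, u) in R
        if s == s2
    )
```

For example, on the two-point frame with 0 R 1 and 1 R 1 all four properties hold, whereas the frame with the single pair 0 R 1 is neither serial nor weakly dense.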

Last but not least, axiom (11), which combines both modalities, is associated with the
growth of sets. It is very powerful, as we shall see below. By adding the
following rules we get a deductive system designated CG. Let $\alpha, \beta \in \mathcal{F}$.



(1) $\dfrac{\alpha \to \beta,\ \alpha}{\beta}$ (2) $\dfrac{\alpha}{K\alpha}$ (3) $\dfrac{\alpha}{\Box\alpha}$

So, we have modus ponens, $K$-necessitation and $\Box$-necessitation. Soundness
of the system w.r.t. the structures introduced in Definition 1(3) can easily be
established.



Proposition 1. All of the above axioms hold in every model, and the rules 
preserve validity. 



We are going to sketch how completeness of the system CG w.r.t. the class of
densely ordered models is proved. Rational and continuous models are considered
in a separate section afterwards. We start at the canonical model $\mathcal{M}$ of CG. This
is built in the usual way (see [9], §5); i.e., the domain $C$ of $\mathcal{M}$ consists of the set
of all maximal CG-consistent sets of formulas, and the accessibility relations
induced by the modal operators $K$ and $\Box$ are defined as follows:

$s \xrightarrow{K} t :\iff \{\alpha \in \mathcal{F} \mid K\alpha \in s\} \subseteq t$,

$s \xrightarrow{\Box} t :\iff \{\alpha \in \mathcal{F} \mid \Box\alpha \in s\} \subseteq t$,

for all $s, t \in C$. Finally, the distinguished valuation of the canonical model is
defined by $\sigma(A, s) = 1 :\iff A \in s$ ($A \in PV$, $s \in C$). The subsequent
truth lemma is well-known.

Lemma 1. Let us denote the usual satisfaction relation of multimodal logic by
$\models$, and let $\vdash$ designate CG-derivability. Then it holds that

(a) $\mathcal{M} \models \alpha[s]$ iff $\alpha \in s$, and (b) $\mathcal{M} \models \alpha$ iff $\vdash \alpha$, for all $\alpha \in \mathcal{F}$ and $s \in C$.






Parts (a) and (b) of the following proposition are likewise commonly known. 
Axioms (3) - (5) are responsible for (a), whereas (8) - (10) imply (b). Part (c) 
is a consequence of the scheme (11). Its proof is not quite immediate, but can 
be done in a similar fashion as that of [5], Proposition 2.2. 

Proposition 2. (a) The relation $\xrightarrow{K}$ is an equivalence relation on the set $C$.

(b) The relation $\xrightarrow{\Box}$ on $C$ is serial, transitive, weakly dense and weakly
connected.

(c) Let $s, t, u \in C$ be given such that $s \xrightarrow{K} t \xrightarrow{\Box} u$. Then there exists a point
$v \in C$ satisfying $s \xrightarrow{\Box} v \xrightarrow{K} u$.

Following a common manner of speaking in the language of subset spaces 
let us call the property asserted in (c) the modified cross property. — The next 
proposition reads as [8], Proposition 9(3), and is crucial to our purposes. 

Proposition 3. Let $s, t \in C$ be given such that $s \xrightarrow{K} t$ and $s \xrightarrow{\Box} t$ holds. Then $s$
and $t$ coincide.

Later on the following consequence of this proposition and the modified cross
property will be applied.

Corollary 1. Let $s, t \in C$ be given such that $s \xrightarrow{K} t$ and $s \xrightarrow{\Box} s$ holds. Then
also $t \xrightarrow{\Box} t$ is valid.

The next result can be proved inductively with the aid of Proposition 3.

Proposition 4. The relation $\xrightarrow{\Box}$ on $C$ is antisymmetric.

For every $s \in C$ let $[s]$ denote the $\xrightarrow{K}$-equivalence class of $s$. A relation $\prec$
on the set of all such classes is defined as follows:

$[s] \prec [t] :\iff$ there are $s' \in [s]$, $t' \in [t]$ such that $s' \xrightarrow{\Box} t'$,

for all $s, t \in C$. As a consequence of the above assertions and the modified
cross property we get:

Proposition 5. The relation $\prec$ is serial, transitive, weakly dense, weakly
connected and antisymmetric.

Later on we argue that we may restrict attention to the submodel of the
canonical model generated by a suitable $s \in C$ which has carrier

$C^s := [s] \cup \bigcup_{t \in C,\ [s] \prec [t]} [t]$

and accessibility relations the restricted ones. So, let us proceed with this model
which we likewise designate $\mathcal{M}$, abusing notation.

In order to arrive at a densely ordered model eventually we have to ensure that 
density is not caused by reflexivity at certain points. This is done by ‘blowing 






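up’ the model appropriately, substituting every ‘reflexive point’ by a copy of $\mathbb{Q}$.
To this end we let $C^{\mathrm{ref}} := \{t \in C^s \mid t \xrightarrow{\Box} t\}$, define the set

$C' := (C^s \setminus C^{\mathrm{ref}}) \cup \{(r, t) \mid r \in \mathbb{Q} \text{ and } t \in C^{\mathrm{ref}}\}$,

and binary relations $\xrightarrow{K}'$, $\xrightarrow{\Box}'$ on $C'$ by letting for all $x, y \in C'$

$x \xrightarrow{K}' y :\iff$
    $x, y \in C^s \setminus C^{\mathrm{ref}}$ and $x \xrightarrow{K} y$, or
    $x = (r, t)$, $y = (r, t')$ and $t \xrightarrow{K} t'$

for some $r \in \mathbb{Q}$ and $t, t' \in C^{\mathrm{ref}}$, and

$x \xrightarrow{\Box}' y :\iff$
    $x, y \in C^s \setminus C^{\mathrm{ref}}$ and $x \xrightarrow{\Box} y$, or
    $x = (r, t)$, $y \in C^s \setminus C^{\mathrm{ref}}$ and $t \xrightarrow{\Box} y$, or
    $x \in C^s \setminus C^{\mathrm{ref}}$, $y = (r, t)$ and $x \xrightarrow{\Box} t$, or
    $x = (r, t)$, $y = (r', t')$, $t \neq t'$ and $t \xrightarrow{\Box} t'$, or
    $x = (r, t)$, $y = (r', t)$ and $r < r'$,

for some $r, r' \in \mathbb{Q}$ and $t, t' \in C^{\mathrm{ref}}$. Then all the properties stated in Proposition 2
hold for $\xrightarrow{K}'$ and $\xrightarrow{\Box}'$, respectively. Moreover, $\xrightarrow{\Box}'$ is irreflexive, i.e., $x \xrightarrow{\Box}' x$ is not
valid for any $x \in C'$.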

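Proposition 6. (a) The relation $\xrightarrow{K}'$ is an equivalence relation on the set $C'$.

(b) The relation $\xrightarrow{\Box}'$ on $C'$ is serial, irreflexive, transitive, dense and weakly
connected.

(c) Let $s, t, u \in C'$ be given such that $s \xrightarrow{K}' t \xrightarrow{\Box}' u$. Then there exists a point
$v \in C'$ satisfying $s \xrightarrow{\Box}' v \xrightarrow{K}' u$.

Designating the relation induced by $\xrightarrow{\Box}'$ on the set of all $\xrightarrow{K}'$-equivalence classes
$\prec'$ and taking advantage of Corollary 1 (among other things) we get:

Proposition 7. The relation $\prec'$ is a dense linear order.

The distinguished valuation on the canonical model is lifted to $C'$ by

$\sigma'(A, x) := \sigma(A, x)$ if $x \in C^s \setminus C^{\mathrm{ref}}$, and
$\sigma'(A, x) := \sigma(A, t)$ if $x = (r, t)$ for some $r \in \mathbb{Q}$ and $t \in C^{\mathrm{ref}}$,

for all $A \in PV$ and $x \in C'$, where $C^{\mathrm{ref}}$ denotes the set of reflexive points of $C^s$.
Finally, let $h$ denote the canonical mapping from $C'$
onto $C^s$ and $O'$ the preimage of $O$ w.r.t. $h$. The subsequent lemma, in which
$\mathcal{M}' := (C', O', \sigma')$, is then easily proved by a structural induction.

Lemma 2. For all $\alpha \in \mathcal{F}$ and $x \in C'$ we have $\mathcal{M}' \models \alpha[x]$ iff $\mathcal{M} \models \alpha[h(x)]$.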

Now we are in a position to define a densely ordered model $M$ falsifying
a given non-derivable formula $\alpha \in \mathcal{F}$. For this purpose we choose a maximal
CG-consistent set $s \in C$ containing $\neg\alpha$ and consider both the submodel $\mathcal{M}$ of
the canonical model generated by $s$ and the model $\mathcal{M}'$ depending on $\mathcal{M}$ that
has been just constructed. Let $s_0 := s$, if $s \in C'$, and $s_0 := (0, s)$ otherwise. For







every $x \in C'$ let $[x]$ designate the $\xrightarrow{K}'$-equivalence class of $x$ (this notation will
not be confused with that one of the same kind on $C$), define for all $y \in C'$

$[x] \preceq' [y] :\iff [x] = [y]$ or $[x] \prec' [y]$,

and let $C'_x := \{[y] \mid y \in C' \text{ and } [x] \preceq' [y]\}$. Furthermore, let $C'^{s_0} := \bigcup_{[y] \in C'_{s_0}} [y]$
and define a function $f_x : C'_x \to C'^{s_0}$ for every $x \in C'$ such that $[x] \in C'_{s_0}$ by

$f_x([x']) := x$ if $[x'] = [x]$, and
$f_x([x']) :=$ the $x_1 \in [x']$ satisfying $x \xrightarrow{\Box}' x_1$ otherwise,

for all $[x'] \in C'_x$. According to our previous results every function $f_x$ is
well-defined. The set $X := \{f_x \mid [x] \in C'_{s_0}\}$ will serve as the carrier set of $M$.
Moreover, for every $[x] \in C'_{s_0}$ we let

$U_{[x]} := \{f_y \mid y \in [x]\}$ and $\widetilde{U}_{[x]} := \bigcup_{[s_0] \preceq' [y] \preceq' [x]} U_{[y]}$.

Subsequently we write $\mathcal{O}$ instead of $\{\widetilde{U}_{[x]} \mid [x] \in C'_{s_0}\}$. Finally, we let a valuation
$\sigma$ on $X$ be induced by $\sigma'$ in an obvious sense: $\sigma(A, f_x) = 1 :\iff \sigma'(A, x) = 1$,
for all propositional variables $A$ and functions $f_x \in X$. Note that $\sigma$ is correctly
defined as well. The structure $M := (X, \mathcal{O}, \sigma)$ is a densely ordered model by
construction, and the following truth lemma is valid.

Lemma 3. For all formulas $\beta \in \mathcal{F}$ and $x, x' \in C'^{s_0}$ such that $[x] \preceq' [x']$ it
holds that $f_x, \widetilde{U}_{[x']} \models_{M} \beta$ iff $\mathcal{M}' \models \beta[f_x([x'])]$.

As an immediate consequence we get the first of our desired completeness 
results. 

Theorem 1. Every $\alpha \in \mathcal{F}$ that is not CG-derivable can be falsified in a densely
ordered model.

4 Continuity 

Just as it is the case with the modal system K4DLX mentioned earlier we are
able to prove completeness of the system CG w.r.t. the smaller classes of rational
and continuous models simultaneously. But we have to develop new techniques
for this because we cannot work with filtrations as in classical modal logic. This
is caused by the failure of connectedness of any filtration of $\xrightarrow{\Box}$, which is due to
the fact that a generated submodel of the canonical model is ‘two-dimensional’
in essence. The following notions are preparatory.

Definition 3. Let $\mathcal{I} := (I, \le)$ be a non-empty linearly ordered set.

1. A subset $\emptyset \neq J \subseteq I$ is called a segment of $\mathcal{I}$, iff there is no $i \in I \setminus J$ strictly
between any two elements of $J$.







2. A partition of $I$ into segments is called a segmentation of $\mathcal{I}$.

3. A segmentation of $\mathcal{I}$ is called appropriate, iff every one of its segments $J$ either
consists of a single point or is right-open, i.e., for all $j \in J$ there is a $j'$
in $J$ such that $j < j'$.
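
For a finite linear order the first two notions of Definition 3 can be checked directly. The sketch below assumes the order is given as a Python list of distinct elements in ascending order; the function names are hypothetical. Since every non-empty finite segment has a greatest element, no finite segment with more than one point is right-open, so the notion of an appropriate segmentation only becomes interesting for infinite orders such as the one considered in the text.

```python
def is_segment(order, J):
    """J is a non-empty set of elements of `order` forming a contiguous block."""
    positions = sorted(order.index(j) for j in J)
    return len(positions) > 0 and positions == list(
        range(positions[0], positions[-1] + 1)
    )


def is_segmentation(order, parts):
    """`parts` is a partition of `order` into segments."""
    covered = [j for J in parts for j in J]
    return (
        all(is_segment(order, J) for J in parts)
        and sorted(covered, key=order.index) == list(order)
    )
```

The second condition in is_segmentation compares the multiset of covered elements with the whole order, which rules out both overlaps and uncovered points.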

Subsequently we will have to consider segmentations of the linearly ordered
set $\mathcal{C} := (C_s, \preceq)$, where $C_s := \{[s]\} \cup \{[t] \mid t \in C \text{ and } [s] \prec [t]\}$ and

$[t] \preceq [u] :\iff [t] = [u]$ or $[t] \prec [u]$, for all $t, u \in C^s$,

such that the truth value of a given formula remains unaltered on every segment.
(Note that both $C^s$ and $\prec$ have been defined in Section 3; moreover, $\preceq$ is in fact
a linear ordering because of Proposition 5.) The next definition says what this
means precisely.

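Definition 4. Let $\mathcal{M} = (C^s, \{\xrightarrow{K}|_{C^s \times C^s}, \xrightarrow{\Box}|_{C^s \times C^s}\}, \sigma)$ be the submodel of
the canonical model of CG generated by $s \in C$, $I$ an indexing set and $\mathcal{V} := \{V_\iota \mid
\iota \in I\}$ a segmentation of $\mathcal{C}$.

1. For every $\iota \in I$ we define an equivalence relation $\sim_\iota$ on $\bigcup V_\iota$ by

$x \sim_\iota y :\iff$ there is some $z \in \bigcup V_\iota$ such that $x \xrightarrow{\Box} z$ and $y \xrightarrow{\Box} z$, or $x = y$,

for all $x, y \in \bigcup V_\iota$. Every $\sim_\iota$-class is called a cone of $V_\iota$, and the set of all
cones of $V_\iota$ is designated $\mathcal{S}_\iota$.

2. Let $\alpha \in \mathcal{F}$ be a formula. Then $\alpha$ is called stable on $\mathcal{V}$, iff for all $\iota \in I$ and
cones $S \in \mathcal{S}_\iota$

$\mathcal{M} \models \alpha[y]$ for all $y \in S$, or $\mathcal{M} \models \neg\alpha[y]$ for all $y \in S$.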

The relation $\sim_\iota$ is in fact transitive because $\xrightarrow{\Box}$ is transitive and weakly
connected, by Proposition 2(b). We can always achieve a finite appropriate
segmentation of $\mathcal{C}$ on which a given formula is stable.
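
The passage from points to cones can be illustrated on a finite relation. The following Python sketch groups points that share a common successor; the function name and the representation of the relation as a set of pairs are assumptions of this example. It merges classes with a simple union-find, which coincides with the relation of Definition 4 whenever that relation is already transitive, as is the case under the hypotheses stated above.

```python
def cones(points, R):
    """Classes of: x ~ y iff x = y or x R z and y R z for some z in points."""
    points = list(points)
    parent = {p: p for p in points}

    def find(p):
        # follow parent links up to the representative of p's class
        while parent[p] != p:
            p = parent[p]
        return p

    for x in points:
        for y in points:
            if any((x, z) in R and (y, z) in R for z in points):
                parent[find(x)] = find(y)

    classes = {}
    for p in points:
        classes.setdefault(find(p), set()).add(p)
    return sorted((frozenset(c) for c in classes.values()), key=sorted)
```

In the example used for testing, the points 0, 1 and 2 all see the reflexive point 2 and therefore form one cone, while the isolated point 3 forms a cone of its own.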

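Proposition 8. Let $\alpha \in \mathcal{F}$ be a formula and $\mathcal{C}$ as above. Then there exists a
finite appropriate segmentation $\mathcal{V}_\alpha := \{V_1, \ldots, V_n\}$ of $\mathcal{C}$ such that $\alpha$ is stable
on $\mathcal{V}_\alpha$. Moreover, $\mathcal{V}_\alpha$ can be chosen such that it refines $\mathcal{V}_\beta$ for every subformula $\beta$
of $\alpha$.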

It should be remarked that in proving the proposition one proceeds inductively
and starts with the trivial segmentation $\{C_s\}$ in case $\alpha$ is a propositional variable.
Only the cases $\alpha = (\beta \wedge \gamma)$ and $\alpha = K\beta$ contribute to a refinement of the actually
obtained segmentation.

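Now we define an intermediate model of ‘finite depth’ depending on a given
formula $\alpha$. We let $\mathcal{V}_\alpha := \{V_1, \ldots, V_n\}$ be the finite segmentation of $\mathcal{C}$ according
to Proposition 8 on which $\alpha$ is stable and $X := \{S \mid S \in \mathcal{S}_i \text{ for some } 1 \le i \le n\}$.
Furthermore, we define binary relations $\overset{K}{\longmapsto}$, $\overset{\Box}{\longmapsto}$ on $X$ by

$S \overset{K}{\longmapsto} T :\iff$ for some $i \in \{1, \ldots, n\}$ both $S$ and $T$ are cones of $V_i$

$S \overset{\Box}{\longmapsto} T :\iff$ there exist $x \in S$ and $y \in T$ such that $x \xrightarrow{\Box} y$,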







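for all $S, T \in X$. Finally, we let a valuation $\tau$ on $X$ be defined by
$\tau(A, S) := \sigma(A, x)$ for some $x \in S$, for all $A \in PV$ and $S \in X$. Then we have the following

Lemma 4. The structure $\widehat{\mathcal{M}} := (X, \{\overset{K}{\longmapsto}, \overset{\Box}{\longmapsto}\}, \tau)$ is a model such that

(a) all properties stated in Proposition 2 and Proposition 4 are also valid for
$\overset{K}{\longmapsto}$ and $\overset{\Box}{\longmapsto}$, respectively, and

(b) for all subformulas $\beta$ of $\alpha$, points $t \in C^s$ and cones $S \in X$ containing $t$ it
holds that $\mathcal{M} \models \beta[t]$ iff $\widehat{\mathcal{M}} \models \beta[S]$.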



The model $\widehat{\mathcal{M}}$ of Lemma 4 can be ‘vertically decomposed’ into slices, i.e., sequences

$\Sigma : S_{n-m+1} \overset{\Box}{\longmapsto} S_{n-m+2} \overset{\Box}{\longmapsto} \cdots \overset{\Box}{\longmapsto} S_n$

of maximal length such that $S_i \in \mathcal{S}_i$ for all $n-m+1 \le i \le n$ ($m \in \mathbb{N}$, $m \le n$).
Now let $\Sigma$ be of length $n$ and $S_{i_1}, \ldots, S_{i_k}$ the reflexive cones in $\Sigma$ ($i_1, \ldots, i_k \in \{1, \ldots, n\}$).
Consider any right-open interval $J = [r, p)$ of either $\mathbb{Q}$ or $\mathbb{R}$ ($p$
may be $\infty$). Decompose $J$ into $k$ right-open subintervals $J_1 = [r_1, p_1), \ldots, J_k = [r_k, p_k)$
(in ascending order). Because of the next lemma one can p-morphically
map $J$ onto $\Sigma$, thereby assigning subintervals to reflexive cones and right
end-points to a following non-reflexive cone, if need be (see [9], p. 57).

Lemma 5. For any slice of $\widehat{\mathcal{M}}$, two non-reflexive cones are non-adjacent.

Since the segmentation of $\mathcal{C}$ was chosen to be appropriate, all cones of fixed
index $i \in \{1, \ldots, n\}$ are either reflexive or irreflexive. So, we may define a bimodal
model $\widehat{\mathcal{M}}'$ in which the equivalence classes belonging to the operator $K$ are
indexed by $J$; moreover, the above p-morphism extends suitably to that model.
To be more precise, let for every $j \in \{1, \ldots, n\}$

$J'_j := J_l$ if $j = i_l$ for some $1 \le l \le k$ and ($j = 1$ or $j - 1 = i_{l-1}$),
$J'_j := (r_l, p_l)$ if $j = i_l$ for some $2 \le l \le k$ and $i_{l-1} < j - 1$,
$J'_j := \{p_l\}$ if $j - 1 = i_l$ for some $1 \le l < k$ and $j \neq i_{l+1}$,
$J'_j := \{r_1\}$ if $j = 1$ and $j \neq i_1$.

Note that exactly one of the cases on the right-hand side occurs, according to
our discussion above. Using this we define

$X' := \{(q, S) \mid q \in J'_j \text{ and } S \in \mathcal{S}_j \text{ for some } 1 \le j \le n\}$.

Furthermore, let accessibility relations on $X'$ be given by

$(q, S) \overset{K}{\longmapsto}' (q', S') :\iff q = q'$ and $S \overset{K}{\longmapsto} S'$

$(q, S) \overset{\Box}{\longmapsto}' (q', S') :\iff q < q'$ and $S \overset{\Box}{\longmapsto} S'$,

for all $q, q' \in J$ and $S, S' \in X$. Letting finally the valuation of $\widehat{\mathcal{M}}'$ be the one
induced on $X'$ by that of the intermediate model $\widehat{\mathcal{M}}$ of Lemma 4 we obtain:






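Lemma 6. The relations $\overset{K}{\longmapsto}'$ and $\overset{\Box}{\longmapsto}'$ fulfill all the respective properties stated
in Proposition 6, and for all $\beta \in \mathcal{F}$ and $(q, S) \in X'$ we have $\widehat{\mathcal{M}}' \models \beta[(q, S)]$ iff
$\widehat{\mathcal{M}} \models \beta[S]$, where $\widehat{\mathcal{M}}$ is the intermediate model of Lemma 4.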

The interval $J$ (together with the restriction of $<$ to $J$) is nothing but a
generated substructure of $(\mathbb{Q}, <)$ (and of $(\mathbb{R}, <)$, respectively). Thus, by the
standard submodel lemma of modal logic ([9], 1.7, e.g.), we may assume that $q$
varies over $\mathbb{Q}$ in the above lemma (and over $\mathbb{R}$, respectively).

The situation now is obviously the same as that after Lemma 2, except for
the fact that the set indexing the equivalence classes of the $K$-relation equals $\mathbb{Q}$ here ($\mathbb{R}$,
respectively). Consequently, carrying out the construction of the final model as
a suitable space of functions in the same way as above we get:

Theorem 2. The system CG is complete w.r.t. the classes of rational and con- 
tinuous models, respectively. 

5 Decidability 

Our examination of the previous section also leads to decidability of the set 
of formulas holding in every densely ordered (rational, continuous) model and, 
equivalently, of the set of CG-derivable formulas. Our starting point now is 
Lemma 4, from which the following definition is derived. 

Definition 5. Let $\mathcal{M} := (W, \{R, S\}, \sigma)$ be a bimodal Kripke model; i.e., $W$ is a
non-empty set, $R$ and $S$ are binary relations on $W$, and $\sigma$ is a valuation. Then
$\mathcal{M}$ is called a CG-model, iff

- $R$ is an equivalence relation on $W$,

- $S$ is serial, antisymmetric, transitive, weakly dense and weakly connected,

- for all $s, t, u \in W$: if $s R t$ and $t S u$, then there exists an element $v \in W$
such that $s S v$ and $v R u$,

- for all $s, t \in W$ such that $s S t$ and every $A \in PV$ it holds that $\mathcal{M} \models A[s]$ iff
$\mathcal{M} \models A[t]$.
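
For a finite structure the four conditions of Definition 5 can be verified by brute force. The following Python sketch is a hedged illustration: the function name is_cg_model and the encoding (relations as sets of pairs, the valuation as a dictionary from (variable, world) to a truth value) are assumptions of this example. It checks the conditions only; it does not implement the decision procedure for CG.

```python
def is_cg_model(W, R, S, val):
    """Check the conditions of a CG-model on a finite bimodal Kripke model.

    W   : finite non-empty set of worlds
    R,S : sets of pairs over W
    val : dict mapping (variable, world) to True/False, total on W
    """
    variables = {a for (a, _) in val}
    r_equivalence = (
        all((w, w) in R for w in W)
        and all((t, s) in R for (s, t) in R)
        and all((s, u) in R for (s, t) in R for (t2, u) in R if t == t2)
    )
    s_serial = all(any((s, t) in S for t in W) for s in W)
    s_antisymmetric = all(s == t or (t, s) not in S for (s, t) in S)
    s_transitive = all((s, u) in S for (s, t) in S for (t2, u) in S if t == t2)
    s_weakly_dense = all(
        any((s, u) in S and (u, t) in S for u in W) for (s, t) in S
    )
    s_weakly_connected = all(
        (t, u) in S or (u, t) in S or t == u
        for (s, t) in S
        for (s2, u) in S
        if s == s2
    )
    # s R t and t S u imply s S v and v R u for some v
    cross = all(
        any((s, v) in S and (v, u) in R for v in W)
        for (s, t) in R
        for (t2, u) in S
        if t == t2
    )
    # propositional variables are persistent along S
    persistent = all(
        val[(a, s)] == val[(a, t)] for (s, t) in S for a in variables
    )
    return all([
        len(W) > 0, r_equivalence, s_serial, s_antisymmetric, s_transitive,
        s_weakly_dense, s_weakly_connected, cross, persistent,
    ])
```

A two-world example with 0 S 1 and 1 S 1, where R is the identity and a variable A is true at both worlds, passes all checks; making A false at world 1 violates the last condition.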

In the above definition the relation R corresponds with the modality K; 
accordingly, S and □ are related. — It turns out that the system CG is sound 
and complete w.r.t. the class of CG-models as well. 

Theorem 3. A formula $\alpha \in \mathcal{F}$ is CG-derivable, iff it holds in every CG-model.

Proving soundness of the system w.r.t. CG-models is straightforward, but
concerning completeness one has to utilize Theorem 1.

Because of this theorem it suffices to consider CG-models in order to falsify 
a given formula a that is not derivable in the system CG. Revisiting Lemma 
4 shows that even CG-models of ‘finite depth’ are sufficient for this purpose. 
But we can go one step further: we may in fact confine ourselves to finite CG- 
models. We do not carry out this in detail presently, but mention that one can 
use standard methods of ‘topological’ modal logic to this end; see [8], e.g. — All 
in all, this gives decidability of our logic. 

Theorem 4. The set of formulas derivable in the system CG is decidable. 






6 Prospect 

The given system describes formally the continuous growth of sets in a modal 
setting. As it stands, it is a very general framework, last but not least due 
to the fact that it deals with the propositional case only. However, because of 
the results obtained so far we feel that it could be a good basis for qualitative 
(spatio-)temporal reasoning in contexts where such continuous phenomena have 
to be modelled. 

It was already indicated in the introduction that a discrete version of the logic has 
been worked out and will appear elsewhere. Incorporating among other things 
the common nexttime operator, it generalizes propositional temporal logic of lin- 
ear time correspondingly. Apart from this a formalism expressing both increasing 
and shrinking of sets, and their acting in combination, is desirable. If one con- 
fines oneself to nexttime one can get such a system which is clearly rather weak. 
So, further research has to be concerned with more expressive logical languages 
being tailor-made for the present context. 



References 

1. Artemov, S., Davoren, J., Nerode, A.: Topological Semantics for Hybrid Systems.
Lecture Notes in Computer Science, Vol. 1234 (1997) 1-8

2. Chagrov, A., Zakharyaschev, M.: Modal Logic. Oxford (1997)

3. Chellas, B. F.: Modal Logic: An Introduction. Cambridge (1980)

4. Davoren, J.: Modal Logics for Continuous Dynamics. PhD dissertation, Cornell
University (1998)

5. Dabrowski, A., Moss, L. S., Parikh, R.: Topological Reasoning and The Logic of
Knowledge. Annals of Pure and Applied Logic 78 (1996) 73-110

6. Fagin, R., Halpern, J. Y., Moses, Y., Vardi, M. Y.: Reasoning about Knowledge.
Cambridge (Mass.) (1995)

7. Georgatos, K.: Knowledge Theoretic Properties of Topological Spaces. Lecture
Notes in Computer Science, Vol. 808 (1994) 147-159

8. Georgatos, K.: Knowledge on Treelike Spaces. Studia Logica 59 (1997) 271-301

9. Goldblatt, R.: Logics of Time and Computation. Stanford (1987)

10. Halpern, J. Y., Moses, Y.: Knowledge and Common Knowledge in a Distributed
Environment. Journal of the ACM 37 (1990) 549-587

11. Halpern, J. Y., Vardi, M. Y.: The Complexity of Reasoning about Knowledge and
Time. I. Lower Bounds. Journal of Computer and System Sciences 38 (1989)
195-237

12. Hansen, M. R., Chaochen, Z.: Duration Calculus: Logical Foundations. Formal
Aspects of Computing 9 (1997) 283-330

13. Heinemann, B.: A Topological Generalization of Propositional Linear Time
Temporal Logic. Lecture Notes in Computer Science, Vol. 1295 (1997) 289-297

14. Heinemann, B.: Separating Sets by Modal Formulas. Lecture Notes in Computer
Science, Vol. 1548 (1999) 140-153

15. Prade, H. (ed.): ECAI 98. 13th European Conference on Artificial Intelligence.
Chichester (1998)



Model Checking Knowledge and Time in 
Systems with Perfect Recall* 
(Extended Abstract) 



Ron van der Meyden^1 and Nikolay V. Shilov^2

^1 School of Computer Science and Engineering,
University of New South Wales, Sydney 2052, Australia.
meyden@cse.unsw.edu.au
^2 Institute of Informatics Systems, Novosibirsk
6, Lavrent’ev av., Novosibirsk, 630090, Russia
shilov@iis.nsk.su



Abstract. This paper studies model checking for the modal logic of 
knowledge and linear time in distributed systems with perfect recall. 
It is shown that this problem (1) is undecidable for a language with 
operators for until and common knowledge, (2) is PSPACE-complete 
for a language with common knowledge but without until, (3) has non- 
elementary upper and lower bounds for a language with until but without 
common knowledge. Model checking bounded knowledge depth formulae 
of the last of these languages is considered in greater detail, and an 
automata-theoretic decision procedure is developed for this problem, that 
yields a more precise complexity characterization. 



1 Introduction 

Modal logics have been found to be convenient formalisms for reasoning about 
distributed systems [MP91], in large part because such logics enable automated 
verification by model checking of specifications [CGP99]. This involves construct- 
ing a model of the system to be verified, and then testing that this model satisfies 
a formula specifying the system. Frequently, the model of the system is finite 
state, but model checking of infinite state systems is an emerging area of research. 

Epistemic logic, or the logic of knowledge [HM90,FHMV95], is a recent ad- 
dition to the family of modal logics that have been applied to reasoning about 
distributed systems. This logic allows one to express that an agent in the system 
knows (has the information) that some fact holds. This expressiveness is particu- 
larly useful for reasoning about distributed systems with unreliable components 
or communication media. In such settings, information arises in subtle ways, 
and it can be difficult to express the precise conditions under which an agent 

* Work supported by an Australian Research Council Large Grant, done while the 
authors were employed in the School of Computing Sciences, University of Tech- 
nology, Sydney. The first author acknowledges the hospitality of the Department of 
Computer Science, Utrecht University while revising this paper. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 432—445, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






has certain knowledge. On the other hand, the behavior of agents is often a sim- 
ple function of their state of knowledge. Examples of knowledge-level analysis of 
systems illustrating this claim are given in [FHMV95]. 

A topic of interest for logics of knowledge is the extent to which they (like 
other modal logics) allow for automated analysis of designs and specifications. 
Combinations of temporal and epistemic logics are especially significant, since a 
frequent concern in applications is how knowledge changes over time. A number 
of papers have studied the problem of model checking logics of knowledge and 
time in finite state systems [HV91,FHMV95,Var96]. However, much of the lit- 
erature on applications of logics of knowledge assumes that agents have perfect 
recall, i.e., remember all their past states, and this results in infinite state sys- 
tems. Model checking of the logic of knowledge with respect to the perfect recall 
semantics has been considered by van der Meyden [Mey98], but this work deals 
with a language that does not include temporal operators. 

In the present paper, we study model checking a combined logic of knowledge 
and linear time in synchronous systems with perfect recall. Like van der Mey- 
den [Mey98], we assume that agents operate in a finite state environment, but 
we extend this framework to allow Büchi fairness constraints. Since the perfect
recall assumption generates an infinite Kripke structure from the environment, 
the problem we study is an example of infinite state model checking. 

Formal definitions of synchronous systems with perfect recall and the model
checking problem are presented in Section 2. While model checking a logic with
operators for knowledge and common knowledge is decidable [Mey98], the
addition of the linear time temporal operators next and until makes the problem
undecidable (Section 3). However, decidability is retained for two fragments of
this extended language: the fragments in which we (1) omit the until operator
(this case is PSPACE complete) or (2) omit the common knowledge operator
(this case is non-elementary). The latter result may be obtained by means of
reductions to and from various powerful logics already known to be decidable
(weak S1S and Chain Logic with an equal level predicate [Tho92]). However, we
also present (in Section 4) an alternative proof of this latter result, using novel
automata-theoretic constructions, that provides a more informative complexity
characterization. Section 5 discusses related work and topics for further research.

2 Basic Definitions 

This section further develops the definition of environments of [Mey98] by adding 
fairness constraints, and defines the model checking problem we study. 

Let Prop be a set of atomic propositional constants, $n > 0$ be a natural
number and $\mathcal{O}$ be a set. Define a finite interpreted environment for $n$ agents to
be a tuple $E$ of the form $(S, I, T, O, \pi, \alpha)$ where the components are as follows:

1. $S$ is a finite set of states of the environment,

2. $I$ is a subset of $S$, representing the possible initial states,

3. $T \subseteq S^2$ is a transition relation,



434 Ron van der Meyden and Nikolay V. Shilov 



4. $O$ is a tuple $(O_1, \ldots, O_n)$ of functions, where for each $i \in \{1..n\}$
   the component $O_i : S \to \mathcal{O}$ is called the observation function of agent $i$,

5. $\pi : S \to \{0,1\}^{Prop}$ is an interpretation,

6. $\alpha \subseteq S$ is an acceptance condition.

Intuitively, an environment is a finite-state transition system where states encode
values of local variables, messages in transit, failure of components, etc. For states
$s, s'$ the relation $s\,T\,s'$ means that if the system is in state $s$, then at the next
tick of the clock it could be in state $s'$. If $s$ is a state and $i$ an agent then $O_i(s)$
represents the observation agent $i$ makes when the system is in state $s$, i.e., the
information about the state that is accessible to the agent. The interpretation $\pi$
maps each state to an assignment of truth values to the atomic propositional
constants in Prop. The acceptance conditions are standard Büchi conditions
which are used to model fairness requirements on evolutions of the environment.

A trace of an environment $E$ is a finite sequence of states $s_0 s_1 \ldots s_m$ such
that $s_0 \in I$ and $s_j\,T\,s_{j+1}$ for all $j < m$. A run of an environment $E$ is an infinite
sequence $r : \mathbb{N} \to S$ of states of $E$ such that every finite prefix of $r$ is a trace
of $E$ and there exists a state $s \in \alpha$ that occurs infinitely often in $r$. We say that
the acceptance condition of $E$ is trivial if $\alpha = S$. A point of $E$ is a tuple $(r, m)$,
where $r$ is a run of $E$ and $m$ a natural number. Intuitively, a point identifies a
particular instant of time along the history described by the run.
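
The definitions of environments, traces and runs can be made concrete with a minimal Python sketch. The class name `Environment`, its field names, the helper functions and the two-state toy environment `TOY` are illustrative assumptions and do not come from the paper. Infinite runs are handled only in ultimately periodic ("lasso") form, a prefix followed by a cycle repeated forever, which is enough to illustrate the Büchi condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical encoding of an environment (S, I, T, O, pi, alpha)."""
    states: frozenset    # S
    initial: frozenset   # I, a subset of S
    trans: frozenset     # T, a set of pairs (s, s')
    obs: tuple           # obs[i][s] is the observation of agent i in state s
    interp: dict         # interp[s] is the set of propositions true in s
    accept: frozenset    # alpha, the Buchi acceptance condition


def traces(env, steps):
    """All traces s_0 ... s_steps, i.e. traces with exactly `steps` transitions."""
    result = [(s,) for s in sorted(env.initial)]
    for _ in range(steps):
        result = [t + (s2,) for t in result
                  for (s1, s2) in sorted(env.trans) if s1 == t[-1]]
    return result


def is_run_lasso(env, prefix, cycle):
    """Check that prefix . cycle^omega starts in I, follows T and is fair."""
    word = tuple(prefix) + tuple(cycle)
    if not cycle or word[0] not in env.initial:
        return False
    # Consecutive steps inside the word, plus the step closing the cycle.
    steps = list(zip(word, word[1:])) + [(cycle[-1], cycle[0])]
    follows_t = all(step in env.trans for step in steps)
    # A state occurs infinitely often exactly when it lies on the cycle.
    fair = any(s in env.accept for s in cycle)
    return follows_t and fair


# Toy environment (an assumption): agent 0 sees the state, agent 1 sees nothing.
TOY = Environment(
    states=frozenset({0, 1}),
    initial=frozenset({0}),
    trans=frozenset({(0, 0), (0, 1), (1, 1)}),
    obs=({0: "a", 1: "b"}, {0: "x", 1: "x"}),
    interp={0: set(), 1: {"p"}},
    accept=frozenset({1}),
)
```

In this toy environment the sequence that stays in state 0 forever follows the transition relation but is not a run, since no state of the acceptance condition occurs infinitely often.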

Individual runs of an environment provide sufficient structure for the
interpretation of formulae of linear temporal logic. To interpret formulae
involving knowledge, we use the agents' observations to determine the points they
consider possible. There are many ways one could do this. The particular
approach used in this paper models a synchronous perfect-recall semantics of
knowledge. Given a run $r$ of an environment for $n$ agents with observation
functions $O_1, \ldots, O_n$, we define the local state of agent $i$ at time $m \ge 0$ to be the
sequence $r_i(m) = O_i(r(0)) \ldots O_i(r(m))$. That is, the local state of an agent at
a point in a run consists of a complete record of the observations the agent
has made up to that point. These local states may be used to define for each
agent $i$ a relation $\sim_i$ of indistinguishability on points $(r,m), (r',m')$ of $E$, by
$(r,m) \sim_i (r',m')$ if $r_i(m) = r'_i(m')$. Intuitively, when $(r,m) \sim_i (r',m')$, agent $i$
has failed to receive enough information at these points to determine whether it
is in one situation or the other. Clearly, each $\sim_i$ is an equivalence relation. The
use of the term "synchronous" above reflects the fact that if $(r,m) \sim_i (r',m')$,
then we must have $m = m'$. The relations $\sim_i$ will be used to define the
semantics of knowledge for individual agents. We will also consider an operator
for common knowledge, a kind of group knowledge, for which we use another
relation. If $G \subseteq \{1..n\}$ is a group of agents (i.e., two or more) then we define
the relation $\sim_G$ on points to be the reflexive transitive closure of the union of
all indistinguishability relations $\sim_i$ for $i \in G$, i.e., $\sim_G = (\bigcup_{i \in G} \sim_i)^*$.
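
The two relations can be illustrated with a short Python sketch that works on finite traces rather than on points. This is a simplifying assumption: it relies on every trace being extendable to a run. The function names, the zero-based agent indices and the three-state example are hypothetical.

```python
def local_state(obs_i, trace):
    """Perfect-recall local state: the full sequence of observations so far."""
    return tuple(obs_i[s] for s in trace)


def indistinguishable(obs_i, t1, t2):
    """t1 ~_i t2: the agent has made the same observations along both traces."""
    return local_state(obs_i, t1) == local_state(obs_i, t2)


def group_classes(obs, group, traces):
    """Partition `traces` into classes of ~_G, the reflexive transitive
    closure of the union of the relations ~_i for i in `group`."""
    classes = []
    for t in traces:
        merged, rest = [t], []
        for cls in classes:
            # Merge every existing class that contains a trace related to t.
            if any(indistinguishable(obs[i], t, u) for i in group for u in cls):
                merged.extend(cls)
            else:
                rest.append(cls)
        classes = rest + [merged]
    return classes


# Agent 0 confuses a and b; agent 1 confuses b and c (toy data, an assumption).
OBS = ({"a": "x", "b": "x", "c": "y"}, {"a": "u", "b": "v", "c": "v"})
ONE_STATE_TRACES = [("a",), ("b",), ("c",)]
```

Because local states of different lengths are never equal, traces of different lengths are never related, which is the synchrony property noted in the text. In the example, neither agent alone confuses a with c, yet the two traces fall into the same class of the group relation.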

We will be concerned with model checking a propositional multi-modal
language for knowledge and linear time based on a set Prop of atomic propositional
constants, with formulae generated by the modalities $\bigcirc$ (next), $U$ (until), a
knowledge operator $K_i$ for each agent $i \in \{1..n\}$, and a common knowledge
operator $C_G$ for each group of agents $G \subseteq \{1..n\}$. Formulae of the language
are defined as follows: each atomic propositional constant $p \in Prop$ is a formula,
and if $\varphi$ and $\psi$ are formulae, then so are $\neg\varphi$, $\varphi \wedge \psi$, $\bigcirc\varphi$, $\varphi\,U\,\psi$, $K_i\varphi$ and $C_G\varphi$ for
each $i \in \{1..n\}$ and group $G \subseteq \{1..n\}$. We write $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$ for the set
of formulae. We will refer to sublanguages of this language by a similar
expression that lists the operators generating the language. For example, $\mathcal{L}_{\{K_1,\ldots,K_n,C\}}$
refers to the language of the logic of knowledge (without time). As usual, we use
the abbreviations $\Diamond\varphi$ for $\mathit{true}\,U\,\varphi$, and $\Box\varphi$ for $\neg\Diamond\neg\varphi$.

The semantics of this language is defined as follows. Suppose we are given an
environment $E$ with interpretation $\pi$. We define satisfaction of a formula $\varphi$ at a
point $(r, m)$ of a run of $E$, denoted $E, (r, m) \models \varphi$, inductively on the structure
of $\varphi$. The cases for the temporal fragment of the language are standard:



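$E, (r,m) \models p$ if $\pi(r(m))(p) = 1$, where $p \in Prop$,
$E, (r,m) \models \varphi_1 \wedge \varphi_2$ if $E, (r,m) \models \varphi_1$ and $E, (r,m) \models \varphi_2$,
$E, (r,m) \models \neg\varphi$ if not $E, (r,m) \models \varphi$,
$E, (r,m) \models \bigcirc\varphi$ if $E, (r,m+1) \models \varphi$,
$E, (r,m) \models \varphi_1\,U\,\varphi_2$ if there exists $m'' \ge m$ such that $E, (r,m'') \models \varphi_2$
    and $E, (r,m') \models \varphi_1$ for all $m'$ with $m \le m' < m''$.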



The semantics of the knowledge and common knowledge operators is defined by

$E, (r,m) \models K_i\varphi$ if $E, (r',m') \models \varphi$ for all points $(r',m')$ of $E$
    satisfying $(r',m') \sim_i (r,m)$,

$E, (r,m) \models C_G\varphi$ if $E, (r',m') \models \varphi$ for all points $(r',m')$ of $E$
    satisfying $(r',m') \sim_G (r,m)$.

This definition can be viewed as an instance of the general framework for the
semantics of knowledge proposed in [HM90]. Intuitively, an agent knows a
formula to be true if this formula holds at all points that the agent is unable to
distinguish from the actual point. Common knowledge may be understood as
follows. For $G$ a group of agents, define the operator $E_G$, read "everyone in $G$
knows", by $E_G\varphi = \bigwedge_{i \in G} K_i\varphi$. Then $C_G\varphi$ is equivalent to the infinite
conjunction of the formulae $E_G^k\varphi$ for $k \ge 1$. That is, $\varphi$ is common knowledge if
everyone knows $\varphi$, everyone knows that everyone knows $\varphi$, etc. We refer the reader
to [HM90,FHMV95] for further motivation and background.

We may now define the model checking problem we consider in this paper.
Say that a formula $\varphi$ is realized in the environment $E$ if for all runs $r$ of $E$, we
have $E, (r, 0) \models \varphi$. We are interested in the following problem, which we call the
realization problem: given an environment $E$ and a formula $\varphi$ of a language $\mathcal{L}$,
determine if $\varphi$ is realized in $E$. We will consider this problem with respect to
several sublanguages of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$.



3 Complexity Bounds 

We now present a number of results on the complexity of the realization
problem for various fragments of the language, and briefly sketch their proofs. First,
we consider the most expressive language $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$, containing all the
modal operators we have defined. Here the outcome of our investigation is
negative:

Theorem 1. There exist a class of finite environments for two agents with
trivial acceptance conditions and a formula of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$ such that it is
undecidable whether the formula is realized in a given environment of this class.

That is, even a restricted case of the realization problem for $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$
is undecidable. The proof of Theorem 1 employs ideas from [Mey98] concerning model
checking at a trace for the language $\mathcal{L}_{\{K_1,\ldots,K_n,C\}}$. Stated in terms of our current
terminology and notation, this problem is to determine, given an environment $E$ with
a trivial acceptance condition, a trace $t$ of $E$ and a formula $\varphi$ of $\mathcal{L}_{\{K_1,\ldots,K_n,C\}}$,
whether $E, (r,|t|) \models \varphi$ for all runs $r$ of $E$ extending $t$.¹ This model checking
problem was studied in [Mey98] for both the synchronous and an
asynchronous perfect recall semantics of knowledge. The following two results are
proved in [Mey98].²

Theorem 2. With respect to the synchronous perfect recall semantics, the
problem of model checking at a trace is in PSPACE for the language $\mathcal{L}_{\{K_1,\ldots,K_n,C\}}$.
It is PSPACE hard for the fixed formula $C_{\{1,2\}}\,p$.

Theorem 3. There exists an environment for two agents such that with respect
to the asynchronous perfect recall semantics, the problem of model checking the
fixed formula $C_{\{1,2\}}\,p$ at a given trace of the environment is undecidable.

The proof of the lower bound in Theorem 2 involved showing that the
synchronous semantics can simulate PSPACE computations, with Turing machine
configurations represented as traces and the step relation on configurations
represented by the composition $\sim_1 \circ \sim_2$. The common knowledge operator then
allows us to represent the transitive closure of the step relation, enabling a
formula to refer to the result of a PSPACE computation. The proof of Theorem 3
used a similar representation of Turing machine computations, but first uses
asynchrony to "guess" the amount of space required by the computation. To
prove Theorem 1 we reuse this approach to representation of Turing machine
computations. However, instead of asynchrony, we now use the temporal
operators to refer to a sufficiently long configuration. As before, we then describe the
outcome of the computation starting at that configuration using the common
knowledge operator.

¹ It can be shown that if $r$ and $r'$ are two runs extending $t$ and $\varphi \in \mathcal{L}_{\{K_1,\ldots,K_n,C\}}$
then $E, (r,|t|) \models \varphi$ iff $E, (r',|t|) \models \varphi$.

² In both these results $\{1,2\}$ is a set of agents while $p$ is a propositional constant.

In the language $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n,C\}}$ we have two operators, the until operator
and the common knowledge operator, whose semantics allows an arbitrary reach
through two orthogonal dimensions in our semantic structures. In other contexts,
these operators are individually tractable, e.g., the validity problems for both
the logic of knowledge and common knowledge $\mathcal{L}_{\{K_1,\ldots,K_n,C\}}$ and temporal logic
$\mathcal{L}_{\{\bigcirc,U\}}$ are known to be decidable. It therefore makes sense to study the result of
eliminating one of these operators from our language. For the language obtained
by excluding the until operator we have:

Theorem 4. The realization problem for $\mathcal{L}_{\{\bigcirc,K_1,\ldots,K_n,C\}}$ is PSPACE complete.

The proof of Theorem 4 is similar to the proof in [Mey98] of Theorem 2, and
exploits the fact that, for checking realization of a formula $\varphi$, instead of the infinite
set of runs we can confine our attention to the finite set of traces of length at
most $|\varphi|$, which is, intuitively, the furthest that the temporal operators can reach.
This set of traces has exponentially many elements, but since each trace is of
linear size we may do model checking within polynomial space using techniques
of [Mey98].
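
The remark that the temporal operators of an until-free formula cannot reach beyond $|\varphi|$ can be checked on small examples. The sketch below assumes a hypothetical encoding of formulae as nested tuples such as `("next", f)`, `("K", i, f)` and `("prop", "p")`, and compares the nesting depth of the next operator with the size of the formula.

```python
def next_depth(f):
    """Maximal nesting of 'next' operators in a formula given as a nested tuple."""
    subformulae = [g for g in f[1:] if isinstance(g, tuple)]
    inner = max((next_depth(g) for g in subformulae), default=0)
    return inner + 1 if f[0] == "next" else inner


def size(f):
    """Number of operator and proposition occurrences in the formula."""
    return 1 + sum(size(g) for g in f[1:] if isinstance(g, tuple))


# Hypothetical example: next K_1 next p
PHI = ("next", ("K", 1, ("next", ("prop", "p"))))
```

Here `next_depth(PHI)` is 2 while `size(PHI)` is 4, so evaluating the formula at time 0 never inspects a point later than time 2.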

For the language without common knowledge, some new techniques are
required. Here we obtain the following.

Theorem 5. The realization problem for $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ is decidable, with
non-elementary upper and lower bounds.

The proof for both the upper and the lower bound can be obtained by
reductions from variants of SnS, the Monadic Second Order Logic of $n$
Successors [Tho92], which is interpreted over the infinite tree $\{1..n\}^*$. The proof of
the lower bound in Theorem 5 is by a reduction from WS1S, a version of S1S in
which the second order quantifiers are restricted to range over finite sets. It is
known [Mey74] that WS1S is decidable with lower bound non-elementary in the
size of the formula. The upper bound result can be established by a translation
to the problem of checking the validity of a formula of Chain Logic with the Equal
Level predicate [Tho92] (or CLE) on tree structures. The logic CLE is an
extension of a restriction of SnS. Chain Logic (CL) is obtained from SnS by restricting
the interpretation of the second order quantifiers to sets that are chains, i.e., are
totally ordered by the prefix relation. The logic CLE is obtained by adding to
this restriction of SnS the equal level predicate $E$, defined on $u, v \in \{1..n\}^*$ by
$E(u, v)$ if $|u| = |v|$. This approach to the upper bound for the realization problem
for $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ is rather indirect, however, as decidability
of CLE is proved by Thomas [Tho92] using a translation to S1S. The logic S1S
in turn is known to be decidable using automata theoretic arguments [Büc60].
Thus, we have gone from automata (in the definition of environments) to logic,
and back to automata. In the following section we present a proof of Theorem 5
that is based directly on automata theoretic constructions, and which yields a
more informative complexity characterization.



4 An Algorithm for Bounded Knowledge Depth Formulae 

Our more informative characterization of the complexity of realization for the
language $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ is cast in terms of the knowledge depth of formulae,
i.e., the maximal depth of nesting of knowledge operators in a formula. For
example, $depth(K_1(\bigcirc K_2(q \wedge K_2 r))) = 3$. Our approach to the decidability result
will exploit $k$-trees, a data structure that has previously been used in the
literature [Mey98] to represent the depth $k$ formulae of a logic of knowledge (without
temporal operators) holding at a point of an environment. We show in this section
that $k$-trees also encode enough information to interpret formulae of
$\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ with knowledge depth at
most $k$. Throughout this section we assume a fixed finite environment $E$. We
assume without loss of generality that every trace of $E$ can be extended to a run
of $E$.³
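
Knowledge depth is a simple structural recursion. The sketch below assumes a hypothetical encoding of formulae as nested tuples (`("K", i, f)`, `("next", f)`, `("until", f, g)`, `("and", f, g)`, `("not", f)`, `("prop", name)`) and recomputes the depth of the example formula from the text.

```python
def depth(f):
    """Maximal nesting depth of knowledge operators K_i in the formula f."""
    if f[0] == "prop":
        return 0
    # Skip non-formula arguments such as the agent index of a K operator.
    inner = max(depth(g) for g in f[1:] if isinstance(g, tuple))
    return inner + 1 if f[0] == "K" else inner


# The example from the text: K_1(next K_2(q and K_2 r))
EXAMPLE = ("K", 1, ("next", ("K", 2, ("and", ("prop", "q"),
                                      ("K", 2, ("prop", "r"))))))
```

Temporal and propositional operators leave the depth unchanged; only the three nested knowledge operators on the path to `r` contribute.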



4.1 k-Trees

Intuitively, a $k$-tree, for $k \ge 0$, is a type of finite tree of height $k$ in which vertices
are labelled by states of the environment and edges are labelled by agents. It is
convenient to represent these trees as follows.⁴ For numbers $k \ge 0$ we define by
mutual recursion the set $\mathcal{T}_k$ of $k$-trees over $E$, and the set $\mathcal{F}_k$ of forests of $k$-trees
over $E$. Define $\mathcal{T}_0$ to be the set of tuples of the form $(s, \emptyset, \ldots, \emptyset)$ where $s$ is a
state of $E$ and the number of copies of the empty set $\emptyset$ is equal to the number of
agents $n$. Once $\mathcal{T}_k$ has been defined, let $\mathcal{F}_k$ be the set of all subsets of $\mathcal{T}_k$. Now,
define $\mathcal{T}_{k+1}$ to be the set of all tuples of the form $(s, U_1, \ldots, U_n)$, where $s$ is a
state and $U_i$ is in $\mathcal{F}_k$ for each $i \in \{1..n\}$. We denote $\bigcup_{k \ge 0} \mathcal{T}_k$ by $\mathcal{T}$.

Intuitively, in a tuple $(s, U_1, \ldots, U_n)$, the state $s$ represents the actual state
of the environment, and for each $i \in \{1..n\}$ the set $U_i$ represents the knowledge
of agent $i$. Identifying a 0-tree $(s, \emptyset, \ldots, \emptyset)$ with the state $s$, note that each
component $U_i$ in a 1-tree is simply a set of states: intuitively, those states agent $i$
considers possible. For higher $k$, the set $U_i$ represents agent $i$'s knowledge both
about the universe and other agents' knowledge, up to depth $k$.

The elements of $\mathcal{T}_k$ correspond in an obvious way to trees of height $k$, with
edges labelled by agents and nodes labelled by states. If $w = (s, U_1, \ldots, U_n)$ we
define $root(w)$ to be the state $s$. If $w'$ is an element of $U_i$, then we say that $w'$
is an $i$-child of $w$. When $w \in \mathcal{T}_0$ the labelled tree corresponding to $w$ consists
of just the root, labelled $root(w)$. The tree corresponding to $w \in \mathcal{T}_{k+1}$ has root
labelled with $root(w)$, and for each $i$-child $w' \in \mathcal{T}_k$ of $w$ there is an $i$-labelled
edge from the root to a vertex at which the labelled subtree is that corresponding
to $w'$. The following result characterises $C_k$, the number of $k$-trees over $E$.

Lemma 1. Let $k \ge 0$ be a natural number and $E$ be a finite environment for $n$
agents with $l$ states. Then $C_k$ is not greater than $\exp(n \times l, k)/n$, where $\exp(a, b)$
is the function defined by $\exp(a, 0) = a$ and $\exp(a, b+1) = a \cdot 2^{\exp(a, b)}$.
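
The growth described by Lemma 1 can be checked numerically. The sketch below counts tuples directly from the definition of $k$-trees ($l$ tuples at level 0, and $l \cdot 2^{n \cdot C_k}$ tuples at level $k+1$) and compares the count with the bound. The recurrence used for `exp_tower`, namely $\exp(a, b+1) = a \cdot 2^{\exp(a,b)}$, is an assumption: the source text of the lemma is truncated at this point, and this is the recurrence under which the stated bound matches the count. The function names are hypothetical.

```python
def count_k_trees(l, n, k):
    """Number of tuples (s, U_1, ..., U_n) at level k, for l states and n agents."""
    c = l                       # level 0: one tuple per state
    for _ in range(k):
        c = l * 2 ** (n * c)    # a root state and, per agent, any subset of the previous level
    return c


def exp_tower(a, b):
    """Assumed reading of exp: exp(a, 0) = a and exp(a, b + 1) = a * 2**exp(a, b)."""
    v = a
    for _ in range(b):
        v = a * 2 ** v
    return v
```

Under this reading, one agent and two states already give 2, 8 and 512 trees at levels 0, 1 and 2, so the tower of exponentials grows by one level with each unit of knowledge depth.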

³ An environment not satisfying this condition can easily be modified (without
changing its realization properties) by eliminating states that do not belong to any run.

⁴ The definitions we give here are (for reasons of clarity and space) a slight
simplification of those in [Mey98], which add some complications to enable $k$-trees to be used
to interpret formulae of alternation depth at most $k$, a slightly larger class than the
class of formulae of knowledge depth at most $k$.



For each $k \ge 0$ we may associate with each point $(r, m)$ of $E$ a $k$-tree $F_k(r, m)$,
which captures some of the structure of the indistinguishability relations of the
environment around that point. We proceed inductively. For $k = 0$ we define
$F_0(r, m) = (r(m), \emptyset, \ldots, \emptyset)$. For $k > 0$ we define $F_k(r, m) = (r(m), U_1, \ldots, U_n)$,
where for each agent $i$ we have $U_i$ equal to the set of $(k-1)$-trees $F_{k-1}(r', m')$
where $(r', m')$ is a point of $E$ with $(r', m') \sim_i (r, m)$.

For each point $(r, m)$ of $E$ let $\tau(r, m)$ be the trace $r(0) \ldots r(m)$. It is not
difficult to see that for all $k \ge 0$ and for all points $(r, m)$ and $(r', m')$ with
$\tau(r, m) = \tau(r', m')$ we have $F_k(r, m) = F_k(r', m')$. Thus, we may also view $F_k$
as a function mapping traces of $E$ to $k$-trees, and write $F_k(\tau)$ where $\tau$ is a trace
of $E$. Note that we exploit the fact that every trace may be extended to a run
here.

We now recall from [Mey98] some functions that may be used to update
$k$-trees. These functions were used in [Mey98] to provide an algorithm for the
problem of model checking at a trace (see above). We will use these functions below
to define a sequence of Büchi automata for the realization problem. Let $S$, $T$
and $\mathcal{O}$ be the set of states, the transition relation, and the set of observations
of the environment $E$, respectively. We define for each number $k \ge 0$ the
function $G_k : \mathcal{T}_k \times S \to \mathcal{T}_k$. The definition of $G_k$ will be by mutual recursion with
the functions $H_{k,i} : \mathcal{F}_k \times \mathcal{O} \to \mathcal{F}_k$, where $i \in \{1..n\}$ and $k \ge 0$. Intuitively,
if agent $i$'s state of knowledge (to depth $k$) is represented by the set of
$k$-trees $U$, then $H_{k,i}(U, o)$ represents the agent's revised state of knowledge after
it makes the observation $o \in \mathcal{O}$. We define $G_0(w, s) = (s, \emptyset, \ldots, \emptyset)$. Once $G_k$
has been defined, we define for each $i \in \{1..n\}$ the function $H_{k,i}$ by
taking $H_{k,i}(U, o)$ to be the set of $k$-trees $G_k(w, s)$ where $w \in U$ and $O_i(s) = o$
and $root(w)\,T\,s$, i.e., there exists a transition of $E$ from $root(w)$ to $s$. Using the
functions $H_{k,i}$ we may now define $G_{k+1}$ by setting $G_{k+1}((s, U_1, \ldots, U_n), s')$ to
be $(s', H_{k,1}(U_1, O_1(s')), \ldots, H_{k,n}(U_n, O_n(s')))$.
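
The mutual recursion between $G_k$ and $H_{k,i}$ translates almost literally into code. The following Python sketch represents a $k$-tree as a tuple `(s, U_1, ..., U_n)` whose forests are `frozenset`s; this representation, the function signatures, the zero-based agent indices and the toy data are assumptions made for illustration.

```python
def G(k, w, s, trans, obs):
    """G_k(w, s): update the k-tree w when the environment moves to state s."""
    n = len(obs)
    if k == 0:
        return (s,) + (frozenset(),) * n
    # Each agent revises its forest of (k-1)-trees using its observation of s.
    return (s,) + tuple(
        H(k - 1, i, w[1 + i], obs[i][s], trans, obs) for i in range(n))


def H(k, i, forest, o, trans, obs):
    """H_{k,i}(U, o): the k-trees G_k(w, s2) with w in U, root(w) T s2, O_i(s2) = o."""
    return frozenset(
        G(k, w, s2, trans, obs)
        for w in forest
        for (s1, s2) in trans
        if s1 == w[0] and obs[i][s2] == o)


# Toy data (an assumption): two states, a single agent.
TRANS = {(0, 0), (0, 1), (1, 1)}
BLIND = ({0: "x", 1: "x"},)      # the agent observes nothing
SIGHTED = ({0: "a", 1: "b"},)    # the agent observes the state
LEAF0, LEAF1 = (0, frozenset()), (1, frozenset())
START = (0, frozenset({LEAF0}))  # a 1-tree: in state 0, the agent knows it
```

Moving from `START` to state 1, the blind agent ends up considering both states possible, while the sighted agent still knows the state exactly.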

For our definitions (which are a slight variant of those in [Mey98]), we may
establish the following theorem, essentially the same as a result proved in [Mey98].

Theorem 6. For each $k \ge 0$, and for every finite trace $\tau \cdot s$ of $E$ with final state $s$
and prefix $\tau$, we have the incremental update property $F_k(\tau \cdot s) = G_k(F_k(\tau), s)$.

The definition of the function $F_k$ above is not effective, ranging over the
possibly infinite set of runs of $E$. In the case of traces of length 0 it is easily seen
how to make it effective, obtaining functions that represent agents' knowledge
in the initial states of the environment as a $k$-tree. We inductively define
mappings $f_k : I \to \mathcal{T}_k$ for $k \ge 0$. In the base case, we put $f_0(s) = (s, \emptyset, \ldots, \emptyset)$. For
$k \ge 0$ we define $f_{k+1}(s)$ to be the $(k+1)$-tree with root $s$ and an $i$-child $f_k(s')$ for
each initial state $s'$ with $O_i(s') = O_i(s)$.

Lemma 2. For all runs $r$ of $E$ we have $F_k(r, 0) = f_k(r(0))$.

Note that, here again, we rely on the fact that every trace (of length 0) can 
be extended to a run. 
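
The initial trees $f_k(s)$ are computable directly from the initial states and the observation functions. A minimal Python sketch follows, again representing a $k$-tree as a tuple `(s, U_1, ..., U_n)` of a state and `frozenset` forests; the representation, the function name `f`, the zero-based agent indices and the example data are assumptions.

```python
def f(k, s, initial, obs):
    """f_k(s): the k-tree describing the agents' knowledge in initial state s."""
    n = len(obs)
    if k == 0:
        return (s,) + (frozenset(),) * n
    # Agent i's children: one (k-1)-tree per initial state with the same observation.
    return (s,) + tuple(
        frozenset(f(k - 1, s2, initial, obs)
                  for s2 in initial if obs[i][s2] == obs[i][s])
        for i in range(n))


# Toy data (an assumption): agent 0 sees the initial state, agent 1 does not.
INITIAL = {0, 1}
OBS = ({0: "a", 1: "b"}, {0: "x", 1: "x"})


def leaf(s):
    """The 0-tree for state s with two agents."""
    return (s, frozenset(), frozenset())
```

In the 1-tree `f(1, 0, INITIAL, OBS)`, agent 0 has the single child `leaf(0)`, while agent 1 has both `leaf(0)` and `leaf(1)` as children, reflecting its uncertainty about the initial state.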



4.2 An Automaton Theoretic Characterization 

We now give an automata-theoretic characterization of realization that forms 
the basis for the algorithm discussed below. 

We begin by defining a type of Büchi automata [Büc60]. The specific variety
of automata we need are tuples of the form $A = (S, I, T, \alpha)$, where $S$ is a finite
set of (control) states, $I \subseteq S$ is the set of initial states of the automaton, $T \subseteq S^2$
is a transition relation, and $\alpha \subseteq S$ is its acceptance condition.⁵ An execution
of $A$ is an infinite sequence $e : \mathbb{N} \to S$ of states of $S$ such that for all $m \ge 0$ we
have $e(m)\,T\,e(m+1)$. An execution $e$ is said to be properly initialised if $e(0) \in I$.
An execution is said to be fair if some state in $\alpha$ occurs infinitely often in the
execution. A fair, properly initialised execution is called accepting. We also call
accepting executions runs. The language accepted by the automaton $A$ is the
set $\mathcal{L}(A)$ of all runs of $A$.

Given the environment $E$ fixed above, we now define an infinite sequence
of Büchi automata $A_0(E), \ldots, A_k(E), \ldots$. Each automaton $A_k(E)$ defines a
language consisting of infinite sequences of $k$-trees over that environment. These
automata will be crucial to the algorithm we develop.

Let $E$ be $(S, I, T, O, \pi, \alpha)$ and $k \ge 0$. Define $A_k(E) = (S_k, I_k, T_k, \alpha_k)$ to be
the Büchi automaton with

1. $S_k$ equal to the set $\mathcal{T}_k$ of $k$-trees over $E$,

2. initial states $I_k$ equal to the set of $k$-trees $f_k(s)$ where $s \in I$,

3. transition relation $T_k$ defined by $w\,T_k\,w'$ when there exists a state $s \in S$
   such that $root(w)\,T\,s$ and $w' = G_k(w, s)$,

4. acceptance condition $\alpha_k$ defined by $\alpha_k = \{w \in S_k : root(w) \in \alpha\}$.
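
The four components of the automaton can be assembled mechanically. The Python sketch below is an illustration under several assumptions: the environment is a plain dictionary with hypothetical keys `"I"`, `"T"`, `"obs"` and `"alpha"`, a $k$-tree is a tuple of a state and `frozenset` forests, the update function inlines $H_{k-1,i}$, and only the $k$-trees reachable from the initial ones are built, whereas the definition in the text takes all of $\mathcal{T}_k$ as the state set.

```python
from collections import deque


def f(k, s, env):
    """Initial k-tree f_k(s)."""
    obs, n = env["obs"], len(env["obs"])
    if k == 0:
        return (s,) + (frozenset(),) * n
    return (s,) + tuple(
        frozenset(f(k - 1, s2, env) for s2 in env["I"] if obs[i][s2] == obs[i][s])
        for i in range(n))


def G(k, w, s, env):
    """Update G_k(w, s), with H_{k-1,i} inlined as the inner frozenset."""
    obs, n = env["obs"], len(env["obs"])
    if k == 0:
        return (s,) + (frozenset(),) * n
    return (s,) + tuple(
        frozenset(G(k - 1, c, s2, env)
                  for c in w[1 + i]
                  for (s1, s2) in env["T"]
                  if s1 == c[0] and obs[i][s2] == obs[i][s])
        for i in range(n))


def build_automaton(k, env):
    """Reachable part of A_k(E): (states, initial states, transitions, accepting)."""
    init = {f(k, s, env) for s in env["I"]}
    seen, edges, queue = set(init), set(), deque(init)
    while queue:
        w = queue.popleft()
        for (s1, s2) in env["T"]:
            if s1 == w[0]:                 # root(w) T s2
                w2 = G(k, w, s2, env)
                edges.add((w, w2))
                if w2 not in seen:
                    seen.add(w2)
                    queue.append(w2)
    accepting = {w for w in seen if w[0] in env["alpha"]}
    return seen, init, edges, accepting


# Toy environment (an assumption): two states, one agent that observes nothing.
ENV = {"I": {0}, "T": {(0, 0), (0, 1), (1, 1)},
       "obs": ({0: "x", 1: "x"},), "alpha": {1}}
STATES, INIT, EDGES, ACCEPTING = build_automaton(1, ENV)
```

For this environment and $k = 1$ the reachable part has three states: the initial tree, in which the agent knows the state is 0, and two trees with roots 0 and 1 in which the agent considers both states possible.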

Since $A_k(E)$ is a Büchi automaton on infinite words, the notions of an
execution, a fair execution, a properly initialised execution and a run of $A_k(E)$ are
meaningful. We define a projection operation $Proj_k$ mapping runs $r$ of the
automata $A_k(E)$ to infinite sequences of states of $E$, by $Proj_k(r)(m) = root(r(m))$.
Conversely, there exists a lift operation $Lift_k$ mapping runs $r$ of the
environment $E$ to sequences of $k$-trees, defined by $Lift_k(r)(m) = F_k(r(0) \ldots r(m))$. The
proof of the following is straightforward from the definitions, Theorem 6 and
Lemma 2:

Lemma 3. For each $k \ge 0$ the mappings $Proj_k$ and $Lift_k$ are inverse functions;
$Proj_k$ maps the set of runs of $A_k(E)$ onto the set of runs of $E$ while $Lift_k$ maps
the set of runs of $E$ onto the set of runs of $A_k(E)$.

We now show that the automata $A_k(E)$ adequately capture the depth $k$ formulae
of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ holding at points of $E$. To do so, we define for each $k$ a relation
$\models_k$ between points in executions of $A_k(E)$ and formulae of knowledge depth $\le k$.
The definition of $\models_k$ is by induction on $k$, as follows. For the basic propositions
and the temporal operators, the definition is much like the standard semantics.
Thus, for an execution $e$ of $A_k(E)$ and $m \ge 0$,

⁵ These are slightly less general than usual, in that the input alphabet and control
states coincide, and the transition function has a specific form that is derived from
the transition relation.



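$E, (e,m) \models_k p$ if $\pi(root(e(m)))(p) = 1$, where $p \in Prop$,
$E, (e,m) \models_k \varphi_1 \wedge \varphi_2$ if $E, (e,m) \models_k \varphi_1$ and $E, (e,m) \models_k \varphi_2$,
$E, (e,m) \models_k \neg\varphi$ if not $E, (e,m) \models_k \varphi$,
$E, (e,m) \models_k \bigcirc\varphi$ if $E, (e,m+1) \models_k \varphi$,
$E, (e,m) \models_k \varphi_1\,U\,\varphi_2$ if there exists $m'' \ge m$ such that $E, (e,m'') \models_k \varphi_2$
    and $E, (e,m') \models_k \varphi_1$ for all $m'$ with $m \le m' < m''$.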



The interesting case concerns the knowledge operators, where we make use of
the fact that we are dealing with a sequence of $k$-trees. It is convenient to first
define $\models_k$ not on points, but on $k$-trees $w$, for formulae $K_i\varphi$ of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$
of knowledge depth at most $k$, by

$E, w \models_k K_i\varphi$ if for all $(k-1)$-trees $w'$ that are $i$-children of $w$, and for all fair
executions $e$ of $A_{k-1}(E)$ such that $e(0) = w'$, we have $E, (e, 0) \models_{k-1} \varphi$.

Note that we consider fair executions $e$ rather than runs in this definition
because $w'$ is not necessarily an initial state of $A_{k-1}(E)$. We then define $\models_k$ on
points by

$E, (e, m) \models_k K_i\varphi$ if $E, e(m) \models_k K_i\varphi$.

The following result establishes a connection between the relations $\models_k$ and
the semantics of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ in an environment $E$.

Lemma 4. For every natural number $k \ge 0$, every formula $\varphi$ in $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$
of knowledge depth at most $k$, for every environment $E$, every run $r$ of $E$ and
$m \ge 0$ we have $E, (r, m) \models \varphi$ iff $E, (Lift_k(r), m) \models_k \varphi$.

This equivalence forms the basis for our decision procedure for realization. 



4.3 The Algorithm 

We are now in a position to present the algorithm for the realization problem for
$\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$. Let us first note that in the special case where $depth(\varphi) = 0$,
i.e., formulae not containing the knowledge operators, testing realization amounts
to a problem of temporal logic to which well-known techniques may
straightforwardly be applied (see below). To generalize to formulae of greater depth, we
show how to decide the relation $E, w \models_k K_i\varphi$. We achieve this by factoring
formulae into their temporal and knowledge components. To represent the
temporal components, define a context to be just like a formula of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$
but with additional propositional variables from a special separate countable set
$Var$. If $\beta$ is a context then we denote by $Var(\beta)$ the set of all variables which
occur in $\beta$. A pure temporal context is a context not containing any occurrences
of the knowledge operators.

We separate the temporal and knowledge aspects of formulae by means of
the following way of "exploding" a formula. Define a puff to be a finite sequence
of pairs $(set_0, map_0) \ldots (set_m, map_m)$ where the $set_j$ are finite sets of pure
temporal contexts, such that $Var(set_0) = \emptyset$, and for $j \ne j'$ the sets $Var(set_j)$
and $Var(set_{j'})$ are disjoint, and the $map_j$ are mappings
$map_j : Var(set_j) \to \{K_1, \ldots, K_n\} \times set_{j-1}$. (Thus, $map_0 = \emptyset$.) The result $\varphi \cdot map_j$ of applying a
mapping $map_j$ to a context $\varphi$ is defined to be the context obtained by simultaneously
substituting for each occurrence in $\varphi$ of a variable $x \in Var(set_j)$ the formula $K_i\psi$
such that $map_j(x) = (K_i, \psi)$. A puff is said to be a complete separation of a
formula $\varphi$ if $set_m$ contains a single context $\beta$, such that $\varphi = \beta \cdot map_m \cdot \ldots \cdot map_1$.

Let $cep$ be a function from formulae to puffs such that $cep(\varphi)$ is a complete
separation of $\varphi$. We call the unique pure temporal context $\beta$ in the top level
$set_m$ of a complete separation of $\varphi$ the temporal skeleton of $\varphi$.

Example 1. The puff

$set_0 = \{p\}$                             $map_0 = \emptyset$
$set_1 = \{\bigcirc x, \Diamond x\}$        $map_1 : x \mapsto (K_2, p)$
$set_2 = \{y\,U\,z\}$                       $map_2 : y \mapsto (K_1, \bigcirc x)$
                                            $map_2 : z \mapsto (K_2, \Diamond x)$

is a complete separation of the formula $(K_1 \bigcirc K_2\,p)\,U\,(K_2 \Diamond K_2\,p)$. The temporal
skeleton of this formula is $y\,U\,z$.
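
The substitution underlying a complete separation can be replayed in code. The sketch below assumes a hypothetical encoding of contexts as nested tuples, with variables written `("var", name)` and each mapping stored as a dictionary from variable names to pairs `(agent, context)`. The data encode one reading of Example 1, in which the second sub-formula uses the eventually operator; since the example is damaged in the source, that reading is itself an assumption.

```python
def substitute(ctx, mapping):
    """ctx . map: replace each variable x with K_i psi, where mapping[x] = (i, psi)."""
    if ctx[0] == "var" and ctx[1] in mapping:
        agent, psi = mapping[ctx[1]]
        return ("K", agent, psi)
    if ctx[0] in ("prop", "var"):
        return ctx
    # Rebuild the node, substituting inside sub-contexts only.
    return (ctx[0],) + tuple(
        substitute(c, mapping) if isinstance(c, tuple) else c for c in ctx[1:])


X, Y, Z, P = ("var", "x"), ("var", "y"), ("var", "z"), ("prop", "p")
MAP1 = {"x": (2, P)}
MAP2 = {"y": (1, ("next", X)), "z": (2, ("eventually", X))}
SKELETON = ("until", Y, Z)

# phi = skeleton . map_2 . map_1
PHI = substitute(substitute(SKELETON, MAP2), MAP1)
```

Applying `MAP2` and then `MAP1` to the skeleton rebuilds the original formula, which is the defining condition of a complete separation.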

A valuation is a partial mapping $v : Var \to \mathcal{P}(\mathcal{T})$. This associates with each
propositional variable the set of trees at which it is true. For each valuation $v$ one
can extend the relation $\models_k$ on formulae to a relation $\models_k^v$ on temporal contexts
by simply adding the superscript $v$ to $\models_k$ throughout the clauses above, and by
adding the following clause:

  $E, (e,m) \models_k^v x$, where $x \in Var$ is a variable, iff $e(m) \in v(x)$.

A valuation $v$ is said to be consistent with a puff $(set_0, map_0) \ldots (set_m, map_m)$
if for each index $j$ of the puff, every variable $x \in Var(set_j)$ and every $w \in \mathcal{T}_j$, if
$map_j(x) = (K_i, \beta)$ then $w \in v(x)$ iff $E, w \models_j K_i\beta$.

Lemma 5. Suppose $\varphi$ is a formula of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ of knowledge depth $k$.
Let $\beta$ be the temporal skeleton of $\varphi$. Suppose that $v$ is a valuation consistent
with $cep(\varphi)$. Then for all fair executions $e$ of $A_k(E)$ and for all $m \ge 0$, we have
$E, (e, m) \models_k \varphi$ iff $E, (e, m) \models_k^v \beta$.

By Lemma 4, to determine $E, (r, m) \models \varphi$ for formulae of depth $k$ it suffices
to decide $E, (Lift_k(r), m) \models_k \varphi$. The effect of Lemma 5 is to reduce the
complicated recursion through $k$-trees required to evaluate the latter to the problem
$E, (Lift_k(r), m) \models_k^v \beta$, whose determination involves only temporal steps.

Thus, we obtain the following approach to deciding realization: (1) represent
a formula as a puff, (2) construct a consistent valuation for the puff, (3) evaluate
the temporal skeleton of the formula with respect to this valuation and (4) check
that the skeleton is valid for all initial states. This approach is formalized in the
algorithm in Figure 1. By construction and in accordance with the definition of
consistency, the assertion "$v$ is consistent with $(set_0, map_0) \ldots (set_j, map_j)$" is
an invariant of the loop of the algorithm. Combining this with the results above



INPUT: a finite environment $E$ and a formula $\varphi$
OUTPUT: Y if $\varphi$ is realized in $E$, N otherwise.

PROCEDURE:

1. Let $k := depth(\varphi)$ and suppose $cep(\varphi) = (set_0, map_0) \ldots (set_k, map_k)$.
   Let $v := \emptyset$.

2. For $j := 0$ to $k - 1$ do:

   (a) For all $\beta \in set_j$, let $[\beta] :=$
       $\{ w \in \mathcal{T}_j : (e, 0) \models_j^v \beta$ for every fair execution $e$ of $A_j(E)$ with $e(0) = w \}$.

   (b) For all $x \in Var(set_{j+1})$, if $map_{j+1}(x) = (K_i, \beta)$,
       let $[x] := \{ w \in \mathcal{T}_{j+1} : w' \in [\beta]$ for all $i$-children $w'$ of $w \}$.

   (c) Let $v := v \cup \{ (x, [x]) : x \in Var(set_{j+1}) \}$.

3. For the temporal skeleton $\beta$ of the formula $\varphi$ let $[\beta] :=$
   $\{ w \in \mathcal{T}_k : (e, 0) \models_k^v \beta$ for every fair execution $e$ of $A_k(E)$ with $e(0) = w \}$.

4. If $w \in [\beta]$ for all $w \in I_k$ then output(Y) else output(N).

Fig. 1. An algorithm for realization 



we conclude that the algorithm is correct. As presented, the algorithm is not yet
fully operational: we still need to show how it is possible to compute $[\beta]$ at steps
2(a) and 3. This can be done in space polynomial in the size of $A_k(E)$ and $\beta$
using known techniques [SC85].
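
Step 2(b) of the algorithm, which extends the valuation to a variable standing for a knowledge sub-formula, is a purely set-theoretic operation once $[\beta]$ is available. The Python sketch below covers only that step and takes $[\beta]$ as given; computing it requires a temporal model checking procedure that is not reproduced here. The tuple representation of trees, the function name, the zero-based agent index and the sample data are assumptions.

```python
def knowledge_extension(trees, i, beta_set):
    """[x] for map(x) = (K_i, beta): the trees all of whose i-children lie in [beta]."""
    return {w for w in trees if all(child in beta_set for child in w[1 + i])}


# Toy data for a single agent (index 0): 0-trees and some 1-trees over states 0 and 1.
LEAF0, LEAF1 = (0, frozenset()), (1, frozenset())
CERTAIN = (1, frozenset({LEAF1}))           # the agent considers only state 1 possible
UNSURE = (1, frozenset({LEAF0, LEAF1}))     # the agent considers both states possible
NO_CHILD = (0, frozenset())                 # no children: the condition holds vacuously
BETA_SET = {LEAF1}                          # assumed value of [beta], e.g. for beta = p
```

With these data, `knowledge_extension({CERTAIN, UNSURE, NO_CHILD}, 0, BETA_SET)` keeps `CERTAIN` and `NO_CHILD` and drops `UNSURE`, whose child `LEAF0` lies outside the assumed set.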

Theorem 7. The problem of determining if a formula $\varphi$ of $\mathcal{L}_{\{\bigcirc,U,K_1,\ldots,K_n\}}$ is
realized in an environment $E$ is decidable in space polynomial in
$|\varphi| \cdot \exp(depth(\varphi), O(|E|))$.

5 Conclusion 

It is interesting to note that for each of the languages we have considered, the
complexity we have obtained for the realization problem is the same as the
complexity obtained by Halpern and Vardi [HV88,HV89] for the validity problem.
While there are some commonalities in the proof techniques used, we are not
aware of any straightforward reductions between the two problems.

The problem we have studied here, of checking whether a formula is realized
in a given environment, is closely related to a problem studied by van der
Meyden and Vardi [MV98]. This work also concerns the synchronous perfect recall
semantics. They deal with a notion of environment in which agents are able to
choose their actions based on their local state, and the choice of action
determines the state transitions. They consider the realizability question, of whether
it is possible to decide the existence of (and if so, construct) a protocol (a
function from local state to actions) for an agent such that running this protocol in a
given environment generates a system realizing a given specification in the logic
of knowledge and time. By contrast with our results in this paper, however,
realizability is decidable only in the single agent case, even for environments with
trivial acceptance condition.

Model checking a logic of knowledge and time has also been studied by Clarke
et al. [CJM98] in the context of verification of cryptographic protocols. Their
work assumes a semantics of knowledge very different from ours, based on
explicit computation rather than the information theoretic notion we have studied.
An interesting topic for further research is the applicability of our results, or
adaptations thereof, to verification of cryptographic protocols.

Several dimensions of generalisation of our results are worth considering. In 
particular, it would be desirable to know if our results generalise to the case 
of languages with past time or branching time operators — this generalisation 
is able to express the problem of checking that a given finite state protocol 
implements a given knowledge based program [FHMV95,FHMV97] in a given 
environment, in such a way that agents operate as if they had perfect recall. 
A knowledge based program is a type of specification that describes how an 
agent’s actions relate to its state of knowledge. Verification of knowledge based 
programs for finite state definitions of knowledge has been shown decidable by 
Vardi [Var96]. The perfect recall case remains to be addressed, although the 
linear time case we have presented is already able to yield this result in the special 
case of deterministic knowledge-based programs [MV98]. Also of interest is the 
complexity of realization with respect to other natural definitions of knowledge, 
such as the asynchronous perfect recall semantics. 



References 



Büc60. J. R. Büchi. On a decision method in restricted second order arithmetic. In 
Proc. Internat. Congr. on Logic, Methodology and Philosophy of Science, 
pages 1-11, Stanford, CA, 1960. Stanford Univ. Press. 437, 440 
CGP99. E. M. Clarke, O. Grumberg, and D. Peled. Model Checking. MIT Press, 
Cambridge, MA, 1999. 432 
CJM98. E. Clarke, S. Jha, and W. Marrero. A machine checkable logic of knowledge 
for specifying security properties of electronic commerce protocols. In LICS 
Workshop on Formal Methods and Security Protocols, 1998. 444 
FHMV95. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about 
Knowledge. MIT Press, 1995. 432, 433, 433, 435, 444 
FHMV97. R. Fagin, J. Y. Halpern, Y. Moses, and M. Y. Vardi. Knowledge-based 
programs. Distributed Computing, 10(4):199-225, 1997. 444 
HM90. J. Y. Halpern and Y. Moses. Knowledge and common knowledge in a 
distributed environment. Journal of the ACM, 37(3):549-587, 1990. 432, 
435, 435 
HV88. J. Y. Halpern and M. Y. Vardi. The complexity of reasoning about knowledge 
and time: synchronous systems. Research Report RJ 6097, IBM, 1988. 
443 
HV89. J. Y. Halpern and M. Y. Vardi. The complexity of reasoning about knowledge 
and time, I: lower bounds. Journal of Computer and System Sciences, 
38(1):195-237, 1989. 443 
HV91. J. Y. Halpern and M. Y. Vardi. Model checking vs. theorem proving: a 
manifesto. In V. Lifschitz, editor, Artificial Intelligence and Mathematical 
Theory of Computation (Papers in Honor of John McCarthy), pages 151-176. 
Academic Press, San Diego, Calif., 1991. 433 
Mey74. A. R. Meyer. The inherent complexity of theories of ordered sets. In Proc. 
of the Int. Congr. of Mathematics, volume 2, pages 477-482, Vancouver, 
1974. Canadian Mathematical Congress. 437 
Mey98. R. van der Meyden. Common knowledge and update in finite environments. 
Information and Computation, 140(2):115-157, 1998. 433, 433, 433, 433, 
436, 436, 437, 437, 438, 438, 439, 439, 439, 439 
MP91. Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent 
Systems. Springer-Verlag, Berlin, 1991. 432 
MV98. R. van der Meyden and M. Y. Vardi. Synthesis from knowledge-based 
specifications. In Proc. CONCUR'98, 9th International Conf. on Concurrency 
Theory, pages 34-49. Springer LNCS No. 1466, 1998. 443, 444 
SC85. A. P. Sistla and E. M. Clarke. The complexity of propositional linear 
temporal logic. Journal of the ACM, 32(3):733-749, 1985. 443 
Tho92. W. Thomas. Infinite trees and automaton-definable relations over ω-words. 
Theoretical Computer Science, 103:143-159, 1992. 433, 437, 437, 437 
Var96. M. Y. Vardi. Implementing knowledge-based programs. In Proc. of the 
Conf. on Theoretical Aspects of Rationality and Knowledge, pages 15-30, 
San Mateo, CA, 1996. Morgan Kaufmann. 433, 444 



The Engineering of Some Bipartite Matching Programs 

Kurt Mehlhorn 

Max-Planck-Institut für Informatik, 
Im Stadtwald, 66123 Saarbrücken, Germany, 
www.mpi-sb.mpg.de/~mehlhorn 

Over the past years my research group has been involved in the development of 
three algorithm libraries: 

- LEDA, a library of efficient data types and algorithms [LED], 

- CGAL, a computational geometry algorithms library [CGA], and 

- AGD, a library for automatic graph drawing [AGD]. 

In this talk I will discuss some of the lessons learned from this work. I will do 
so on the basis of the LEDA-implementations of bipartite cardinality matching 
algorithms. The talk is based on Section 7.6, pages 360-392, of [MN99]. In this 
book Stefan Näher and I give a comprehensive treatment of the LEDA system 
and its use. We treat the architecture of the system, we discuss the functionality 
of the data types and algorithms available in the system, we discuss the imple- 
mentation of many modules of the system, and we give many examples for the 
use of LEDA. 

My personal level of involvement was very different in the three projects. 
The LEDA project was started in 1989 by Stefan Näher and myself, and I have 
been involved as a designer and system architect, implementer of algorithms 
and tools, writer of documentation and tutorials, and user of the system. For 
CGAL I acted as an advisor, and for AGD my involvement was marginal. 

The bipartite cardinality matching problem asks for the computation of a 
maximum cardinality matching M in a bipartite graph $(A \cup B, E)$. A matching 
in a graph G is a subset M of the edges of G such that no two edges in M share 
an endpoint. 
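The definition of a matching can be made concrete with a minimal sketch. The function name `is_matching` and the representation of edges as pairs (a, b), with a drawn from A and b from B, are assumptions made for illustration; they are not taken from LEDA.

```python
def is_matching(edges):
    """Return True if no two edges in `edges` share an endpoint.

    Each edge is a pair (a, b) with a in A and b in B. The two sides are
    tracked separately, so equal labels on opposite sides do not clash.
    """
    seen_a, seen_b = set(), set()
    for a, b in edges:
        if a in seen_a or b in seen_b:
            return False  # an endpoint is already used by an earlier edge
        seen_a.add(a)
        seen_b.add(b)
    return True
```

A maximum cardinality matching is then a set of graph edges that passes this test and has the largest possible size.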

I will discuss the following points: 

- Specification: We discuss several specifications of the problem and 
their relative merits, in particular with respect to verification and flexibility. 

- Checking and verification: We discuss how a matching algorithm can justify 
its answers and how answers can be checked. 

- Representations of matchings: We discuss how matchings can be represented 
and what the relative merits of the representations are. 

- Reinitialization in iterative algorithms: Most matching algorithms work in 
phases. We discuss how to reinitialize data structures in a cost-effective way. 

- Search for augmenting paths by depth-first search or by breadth-first search: 
We discuss the relative merits of the two methods. 



C. Pandu Rangan, V. Raman, R. Ramanujam (Eds.); FSTTCS’99, LNCS 1738, pp. 446—449, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 






 n   m     k     FFB    dfs-    dfs+    bfs-    bfs+     HK-     HK+     AB-     AB+    Check
 2   4     1   311.1    1.63    1.14    1.08    0.93     1.5    1.42    0.94    0.96     0.09
 2   4    50   319.2    1.24    0.65    0.71    0.58    1.09    0.99  0.7599    0.77  0.07001
 2   4  2500   316.1    0.35    0.32    0.32     0.3    1.01    0.92    0.69    0.68  0.07001
 2   6     1     404   24.06    6.72   18.97     7.1    2.35    2.24    1.29    1.26   0.1001
 2   6    50     397      71   15.22   12.29    7.57    1.76    1.67    0.97    0.95  0.08997
 2   6  2500   313.9    3.15    1.12  0.6902  0.6299    2.37    2.29    0.78    0.76  0.08008
 2   8     1   364.8    7.42    2.61    7.56    3.71    2.71    2.55    2.84    2.95   0.1199
 2   8    50     360   34.15   12.47    9.35    7.59    2.19     2.1    1.68    1.64  0.09985
 2   8  2500   360.2    42.8   10.47     1.9    1.76     2.9    2.74    1.23    1.22  0.08984
 4   8     1       -    4.43    3.23    3.03    2.64     3.3    3.05    2.47    2.47   0.1699
 4   8    50       -    2.95    1.84    1.76    1.52    2.84    2.62    1.92    1.91   0.1501
 4   8  2500       -  0.9202  0.6599    0.71  0.6599    2.38    2.18    1.69    1.66   0.1501
 4  12     1       -     108   27.77   87.78   29.59    5.49    5.22    3.44    3.43     0.21
 4  12    50       -   317.5   67.39   57.71   34.76    3.86    3.65    2.62    2.53   0.1699
 4  12  2500       -   291.2   77.17   29.31   22.05    4.19    3.92    2.09    2.05   0.1699
 4  16     1       -   23.81    9.01   26.91   10.93     5.3    4.93    2.66    2.66     0.25
 4  16    50       -   205.2    59.7   46.32   38.94    4.92    4.62    2.27    2.31   0.2002
 4  16  2500       -   470.3   105.3   16.62   14.71    4.46    4.23    2.09     2.1   0.1699
 8  16     1       -       -       -    6.27    7.57    8.28    7.75    6.76    6.61     0.37
 8  16    50       -       -       -     4.5    4.22     5.8    5.55    5.13    5.13     0.31
 8  16  2500       -       -       -    1.66     1.4    4.69    4.42    4.33    4.26     0.28
 8  24     1       -       -       -   378.3   116.5   12.54   12.01    9.77    9.52     0.45
 8  24    50       -       -       -   248.2   152.6    9.94    9.51    7.39    7.32     0.36
 8  24  2500       -       -       -   118.2   82.81     6.8    6.41    5.48    5.42   0.3301
 8  32     1       -       -       -   109.6    39.3    12.2   11.63    9.87    9.84     0.39
 8  32    50       -       -       -   181.9   157.6   10.47   10.08     7.4    7.36     0.39
 8  32  2500       -       -       -   63.56   50.97     9.9    9.54    5.53    5.48     0.37



Table 1. Running times of our matching algorithms. The first columns show 
the values of n/10^ and m/10^, respectively. The meaning of the other columns 
is explained in the text. A dash indicates that the program was not run on the 
instance. 



- The use of heuristics to find an initial solution: Matching algorithms can 
either start from an empty matching or can use a heuristic to construct an 
initial matching. 

- Simultaneous search for augmenting paths: The fastest matching algorithms 
search for augmenting paths in order of increasing length. 

- Documentation: We discuss the merits of literate programming for 
documentation and why we use it to document our implementations. 

Table 1 shows the running times of our bipartite matching algorithms; the 
source code of all our implementations can be found in [MN99]. A plus sign 
indicates the use of the greedy heuristic for finding an initial matching and a 
minus sign indicates that the algorithm starts with the empty matching. The 
algorithms HK [HK73] and AB [ABMP91] have a worst case running time of 
$O(\sqrt{n}\,m)$ and the other algorithms have a worst case running time of $O(nm)$. 
FFB stands for the basic version of the Ford and Fulkerson algorithm [FF63]. 
It runs in n phases, uses depth-first search for finding augmenting paths, and 
uses $O(n)$ time at the beginning of each phase for initialization. Its best case 
running time is therefore $O(n^2)$. The algorithms dfs and bfs are variants of the Ford and 
Fulkerson algorithm. They avoid the costly initialization at the beginning of each 
phase and use depth-first and breadth-first search, respectively. The algorithms 
HK and AB use breadth-first search and search for augmenting paths in order 
of increasing length. The last column shows the time to check the result of the 
computation. 
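The ideas in this paragraph can be illustrated with a minimal sketch of a bfs-style variant. It is not the LEDA implementation: the function names, the adjacency-list input `adj` (where `adj[a]` lists the neighbours in B of node a in A), and the use of phase stamps as the way to avoid per-phase initialization are all assumptions made for illustration.

```python
from collections import deque

def greedy_matching(adj, n_b):
    """Greedy heuristic: match each node of A to its first free neighbour."""
    mate_a = [None] * len(adj)
    mate_b = [None] * n_b
    for a, nbrs in enumerate(adj):
        for b in nbrs:
            if mate_b[b] is None:
                mate_a[a], mate_b[b] = b, a
                break
    return mate_a, mate_b

def max_matching_bfs(adj, n_b, use_heuristic=True):
    """Maximum cardinality matching by one BFS per free node of A.

    Returns mate_a, where mate_a[a] is the partner of a in B or None.
    """
    if use_heuristic:
        mate_a, mate_b = greedy_matching(adj, n_b)
    else:
        mate_a, mate_b = [None] * len(adj), [None] * n_b
    # stamp[b] == phase means "b was reached in the current phase".
    # Incrementing `phase` invalidates all marks at once, so no O(n)
    # reset is needed when a new phase starts.
    stamp = [0] * n_b
    pred = [None] * n_b  # pred[b]: node of A from which b was reached
    phase = 0
    for root in range(len(adj)):
        if mate_a[root] is not None:
            continue
        phase += 1
        queue = deque([root])
        end = None  # free node of B that ends an augmenting path
        while queue and end is None:
            a = queue.popleft()
            for b in adj[a]:
                if stamp[b] == phase:
                    continue
                stamp[b] = phase
                pred[b] = a
                if mate_b[b] is None:
                    end = b
                    break
                queue.append(mate_b[b])
        # Flip matched and unmatched edges along the path back to root.
        while end is not None:
            a = pred[end]
            previous = mate_a[a]
            mate_a[a], mate_b[end] = end, a
            end = previous
    return mate_a
```

Calling `max_matching_bfs(adj, n_b, use_heuristic=False)` corresponds to a "minus" variant that starts from the empty matching, while the default corresponds to a "plus" variant. Replacing the queue by a stack would give a depth-first search instead.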

We used bipartite group graphs $G_{n,m,k}$, as suggested by [CGM+97] in their 
experimental study of bipartite matching algorithms, for our experiments. A 
graph $G_{n,m,k}$ has n nodes on each side. On each side the nodes are divided 
into k groups of size n/k each (this assumes that k divides n). Each node in A 
has degree d = m/n and the edges out of a node in group i of A go to random 
nodes in groups i + 1 and i - 1 of B. 
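A generator for such graphs might look as follows. This is a sketch under stated assumptions: the name `group_graph` is hypothetical, group indices are taken modulo k (the text does not say what happens for the first and last group), each edge picks one of the two neighbouring groups uniformly at random, and parallel edges are not filtered out.

```python
import random

def group_graph(n, m, k, seed=0):
    """Adjacency lists adj[a] of a bipartite group graph with n nodes per side."""
    if n % k != 0 or m % n != 0:
        raise ValueError("k must divide n and n must divide m")
    rng = random.Random(seed)
    size = n // k   # nodes per group
    d = m // n      # degree of every node in A
    adj = []
    for a in range(n):
        i = a // size  # group of node a
        nbrs = []
        for _ in range(d):
            # Assumption: groups wrap around, so group k - 1 is next to group 0.
            g = rng.choice(((i + 1) % k, (i - 1) % k))
            nbrs.append(g * size + rng.randrange(size))
        adj.append(nbrs)
    return adj
```

With k = 1 every edge goes to a uniformly random node of B, so this parameter setting gives a random bipartite graph with fixed degree on the A side.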

The running times of our algorithms differ widely. We observe (the book 
attempts to explain the observations, but we will not do so here) that the program 
with the quadratic best case running time is much slower than the other 
implementations, that dfs is almost always slower than bfs and frequently much slower, 
that the use of the heuristic helps and that its advantage is more prominent for the 
slower algorithms, and that the asymptotically better algorithms are never much 
slower than the asymptotically slower algorithms and sometimes much faster. 
We also see that the time for checking the result of the computation is negligible. 
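One standard way to make the result checkable, sketched here as an assumption rather than as the method used in the talk, is to return a node cover together with the matching. Every edge of a matching needs its own node in any node cover, so a matching and a node cover of equal size certify each other as optimal. The function name and argument layout below are hypothetical.

```python
def check_max_matching(adj, matching, cover_a, cover_b):
    """Verify that `matching` is maximum, using a node cover as witness.

    adj[a] lists the neighbours in B of node a in A, matching is a list
    of (a, b) pairs, and cover_a / cover_b hold the cover nodes per side.
    """
    cover_a, cover_b = set(cover_a), set(cover_b)
    used_a, used_b = set(), set()
    for a, b in matching:
        if b not in adj[a]:
            return False  # not an edge of the graph
        if a in used_a or b in used_b:
            return False  # two matching edges share an endpoint
        used_a.add(a)
        used_b.add(b)
    for a, nbrs in enumerate(adj):
        # Every edge must have at least one endpoint in the cover.
        if a not in cover_a and any(b not in cover_b for b in nbrs):
            return False
    return len(matching) == len(cover_a) + len(cover_b)
```

The check is a single pass over the matching and the edges, which fits the observation that checking is cheap compared with computing the matching.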

Table 1 is a strong case for algorithm engineering and its interplay with 
the theoretical investigation of algorithms. We have algorithms with the same 
asymptotic bounds and widely differing observed behavior. The differences can 
be explained, sometimes analytically and sometimes heuristically, coined into 
implementation principles, and applied to other algorithms. See Sections 7.7 on 
maximum cardinality matching in general graphs, 7.8 on weighted matchings in 
bipartite graphs, and 7.9 on weighted matchings in general graphs of [MN99] to 
see how we applied the lessons learned from bipartite cardinality matchings to 
other matching problems. 

References 

ABMP91. H. Alt, N. Blum, K. Mehlhorn, and M. Paul. Computing a maximum 
cardinality matching in a bipartite graph in time $O(n^{1.5}\sqrt{m/\log n})$. Information 
Processing Letters, 37:237-240, 1991. 448 
AGD. The AGD graph drawing library. http://www.mpi-sb.mpg.de/AGD/. 446 
CGA. CGAL (Computational Geometry Algorithms Library). 
www.cs.ruu.nl/CGAL. 446 
CGM+97. B. Cherkassky, A. Goldberg, P. Martin, J. Setubal, and J. Stolfi. Augment 
or relabel? A computational study of bipartite matching and unit capacity 
maximum flow algorithms. Technical Report TR 97-127, NEC Research 
Institute, 1997. 448 
FF63. L.R. Ford and D.R. Fulkerson. Flows in Networks. Princeton University 
Press, Princeton, NJ, 1963. 448 
HK73. J.E. Hopcroft and R.M. Karp. An $n^{5/2}$ algorithm for maximum matchings 
in bipartite graphs. SIAM Journal on Computing, 2(4):225-231, 1973. 448 
LED. LEDA (Library of Efficient Data Types and Algorithms). 
www.mpi-sb.mpg.de/LEDA/leda.html. 446 
MN99. K. Mehlhorn and S. Näher. The LEDA Platform for Combinatorial and 
Geometric Computing. Cambridge University Press, 1999. 446, 447, 448 







